Intel Corporation is a leading supplier of microcomputer components, modules and systems. When Intel first
introduced the microprocessor in 1971, it created the era of the microcomputer. Today, Intel architectures are
considered world standards. Intel products are used in a wide variety of applications, including embedded
systems such as automobiles, avionics systems and telecommunications equipment, and as the CPU in personal
computers, network servers and supercomputers. Others bring enhanced capabilities to systems and networks.

Intel's mission is to deliver quality products through leading-edge technology.

Intel Corporation makes no warranty for the use of its products and assumes no responsibility for any errors
which may appear in this document, nor does it make a commitment to update the information contained
herein.

Intel retains the right to make changes to these specifications at any time, without notice. 

Contact your local sales office to obtain the latest specifications before placing your order. 

The following are trademarks of Intel Corporation and may only be used to identify Intel products: 

INTEL SERVICE

INTEL'S COMPLETE SUPPORT SOLUTION WORLDWIDE 

Intel Service is a complete support program that provides Intel customers with hardware support, software 
support, customer training, and consulting services. For detailed information contact your local sales offices. 

Service and support are major factors in determining the success of a product or program. For Intel this 
support includes an international service organization and a breadth of service programs to meet a variety of 
customer needs. As you might expect, Intel service is extensive. It can start with On-Site Installation and 
Maintenance for Intel and non-Intel systems and peripherals, Repair Services for Intel OEM Modules and 
Platforms, Network Operating System support for Novell NetWare and Banyan VINES software, Custom 
Integration Services for Intel Platforms, Customer Training, and System Engineering Consulting Services. Intel 
maintains service locations worldwide. So wherever you're using Intel technology, our professional staff is 
within close reach. 

ON-SITE INSTALLATION AND MAINTENANCE 

Intel's installation and maintenance services are designed to get Intel and Intel-based systems, and the
networks they use, up and running fast. Intel's service centers are staffed by trained and certified Customer
Engineers throughout the world. Once systems are installed, Intel is dedicated to keeping them running at
maximum efficiency while controlling costs.

REPAIR SERVICES FOR INTEL OEM MODULES AND PLATFORMS 

Intel offers customers of its OEM Modules and Platforms a comprehensive set of repair services that reduce 
the costs of system warranty, maintenance, and ownership. Repair services include module or system testing 
and repair, module exchange, and spare part sales. 

NETWORK OPERATING SYSTEM SUPPORT 

An Intel software support contract for Novell NetWare or Banyan VINES software means unlimited access to 
troubleshooting expertise any time during contract hours — up to seven days per week, twenty-four hours per 
day. To keep networks current and compatible with the latest software versions, support services include access 
to minor releases and "patches" as made available by Novell and Banyan. 

CUSTOM SYSTEM INTEGRATION SERVICES 

Intel Custom System Integration Services enable resellers to order completely integrated systems assembled 
from a list of Intel386™ and Intel486™ microcomputers and validated hardware and software options. These 
services are designed to complement the reseller's own integration capabilities. Resellers can increase business 
opportunities, while controlling overhead and support costs. 

CUSTOMER TRAINING 

Intel offers a wide range of instructional programs covering various aspects of system design and
implementation. In just three to five days a limited number of individuals learn more in a single workshop than
in weeks of self-study. Covering a wide variety of topics, Intel's major course categories include: architecture
and assembly language, programming and operating systems, BITBUS™, and LAN applications.

SYSTEM ENGINEERING CONSULTING 

Intel provides field system engineering consulting services for any phase of your development or application 
effort. You can use our system engineers in a variety of ways ranging from assistance in using a new product, 
developing an application, personalizing training and customizing an Intel product to providing technical and 
management consulting. Working together, we can help you get a successful product to market in the least 
possible time. 






DATA SHEET DESIGNATIONS 



Intel uses various data sheet markings to designate each phase of the document as it 
relates to the product. The marking appears in the upper, right-hand corner of the data 
sheet. The following is the definition of these markings: 



Data Sheet Marking      Description

Product Preview         Contains information on products in the design phase of
                        development. Do not finalize a design with this
                        information. Revised information will be published when
                        the product becomes available.

Advanced Information    Contains information on products being sampled or in
                        the initial production phase of development.*

Preliminary             Contains preliminary information on new products in
                        production.*

No Marking              Contains information on products in full production.*

*Specifications within these data sheets are subject to change without notice. Verify with your local Intel sales
office that you have the latest data sheet before finalizing a design.

82750DB
DISPLAY PROCESSOR

Programmable Video Timing
- 28 MHz and 45 MHz Operating Frequency
- Pixel/Line Address Range to 4096
- Fully Programmable Sync, Equalization, and Serration Components
- Fully Programmable Blanking and Active Display Start and Stop Times
- Genlocking Capability

Flexible Display Characteristics
- 8-, Pseudo 16-, 16-, and 32-Bit/Pixel Modes
- Selectable Pixel Widths of 1.0, 1.5, 2.0, 2.5, through 14 Periods of the Input Frequency
- Support Popular Display Resolutions: VGA, XGA, NTSC, PAL, and SECAM
- On-Chip Triple DAC for Analog RGB/YUV Output
- Mix Graphics and Video Images on a Pixel by Pixel Basis
- Real Time Expansion of the Reduced Sample Density Video Color Components (U, V) to Full Resolution
- Three Independently Addressable Color Palettes
- Programmable 2X Horizontal Interpolation of Y Channel
- 16 x 16 x 2-Bit Cursor Map with Independently Programmable 2X Expansion Factors in X and Y Dimensions
- YUV to RGB Color Space Conversion
- 2X Vertical Replication of Y, U, and V Data for Displaying Full Motion Video on VGA Monitor
- Register and Function Compatible with the 82750DA



Intel's 82750DB is a custom designed VLSI chip used for processing and displaying video graphic information. 
It is register and function compatible with the 82750DA. 

Reset inputs allow the 82750DB to be genlocked to an external sync source. By programming internal control 
registers, this sync can be modified to accommodate a wide variety of scanning frequencies. A large selection 
of bits/pixel, pixels/line, and pixel widths are programmable, allowing a wide latitude in trading-off image 
quality vs update rate and VRAM requirements. 

The 82750DB can operate in a digitizing mode, wherein it generates timing and control signals to the 82750PB
and VRAM, but does not output display information. Besides digitizer support signals and video
synchronization, the 82750DB outputs digital and analog RGB or YUV information and an 8-bit digital word of
alpha data. This alpha channel data may be used to obtain a fractional mix of 82750DB outputs with another
video source.

82750DB Subsystem Diagram

Intel Corporation assumes no responsibility for the use of any circuitry other than circuitry embodied in an Intel product. No other circuit patent
licenses are implied. Information contained herein supersedes previously published specifications on these devices from Intel.

February 1991    ©INTEL CORPORATION, 1991    Order Number: 240855-003

1.0 82750DB PIN DESCRIPTION

Pinout

Figure 1-1. 82750DB Pinout

Table 1-1. Pin Cross Reference by Pin Name

Pin Name    Location

Table 1-2. Pin Cross Reference by Location

Location  Pin Name     Location  Pin Name     Location  Pin Name     Location  Pin Name
1         Vss          34        Vss          67        Vcc          100       Vcc
2         Vcc          35        Vcc          68        Vss          101       Vss
3         DRV[4]       36        DATAIN[14]   69        BG           102       ALPHA[0]
4         DRV[3]       37        DATAIN[15]   70        VSYNC        103       DBU[7]
5         DRV[2]       38        DATAIN[16]   71        HSYNC        104       Vcc
6         DRV[1]       39        Vss          72        CSYNC        105       DBU[6]
7         DRV[0]       40        DATAIN[17]   73        RESETB#      106       DBU[5]
8         DGY[7]       41        DATAIN[18]   74        SCLK[0]      107       DBU[4]
9         DGY[6]       42        DATAIN[19]   75        Vcc          108       Vss
10        DGY[5]       43        DATAIN[20]   76        Vss          109       Vcc
11        DGY[4]       44        DATAIN[21]   77        SCLK[1]      110       DBU[3]
12        DGY[3]       45        Vcc          78        VBUS[0]      111       DBU[2]
13        DGY[2]       46        DATAIN[22]   79        VBUS[1]      112       DBU[1]
14        DGY[1]       47        DATAIN[23]   80        VBUS[2]      113       DBU[0]
15        DGY[0]       48        Vss          81        VBUS[3]      114       DRV[7]
16        Vss          49        DATAIN[24]   82        Vcc          115       Vss
17        Vss          50        DATAIN[25]   83        CB           116       Vcc
18        DATAIN[0]    51        Vcc          84        DISDIG       117       Vss
19        DATAIN[1]    52        DATAIN[26]   85        BPP[1]       118       DRV[6]
20        DATAIN[2]    53        DATAIN[27]   86        BPP[0]       119       DRV[5]
21        DATAIN[3]    54        DATAIN[28]   87        ACTDIS       120       PIXCLK
22        DATAIN[4]    55        DATAIN[29]   88        ALPHA[7]     121       VGCS
23        DATAIN[5]    56        DATAIN[30]   89        Vss          122       BU
24        DATAIN[6]    57        Vss          90        ALPHA[6]     123       Vcc
25        DATAIN[7]    58        DATAIN[31]   91        Vcc          124       Vss
26        DATAIN[8]    59        VRESET#      92        ALPHA[5]     125       AVss
27        DATAIN[9]    60        HRESET#      93        ALPHA[4]     126       RV
28        DATAIN[10]   61        FCO          94        Vss          127       Vcc
29        DATAIN[11]   62        TESTACT#     95        ALPHA[3]     128       AVcc
30        DATAIN[12]   63        TEST#        96        ALPHA[2]     129       GY
31        DATAIN[13]   64        FREQIN       97        ALPHA[1]     130       IREFIN
32        Vss          65        Vcc          98        Vcc          131       Vss
33        Vcc          66        DISDAC       99        Vss          132       Vcc

Figure 1-2. 82750DB Functional Signal Groupings

Quick Pin Reference



Table 1-3. Pin Descriptions

Symbol: FREQIN
Type: I
Name and Function: FREQUENCY INPUT CLOCK: In normal use, the 82750DB supplies refresh timing for an
associated VRAM through the 82750PB. This places a lower limit on the line frequency, which is a programmed
multiple of FREQIN. It must generate enough refresh cycles, so a minimum line rate of 4 kHz is required.
Furthermore, the 82750PB may run no less than 1/2 the speed of the 82750DB, since the 82750PB samples the
timing and control signals generated by the 82750DB. The period of FREQIN is known as a "T" cycle.

Symbol: RESETB#
Type: I
Name and Function: EXTERNAL RESET: Input signal which places all units in the 82750DB into an initialized
state, and sets the transfer rate to a default value of 1/3X the operating frequency. It is an edge sensitive
input which must be held low for a minimum of ten T-cycles. The slowest transfer rate is selected to ensure
that the 82750DB will read the register information correctly during the first register transfer, independent
of the speed of the VRAMs. During the reset state, the analog video outputs and digital outputs are set to the
black level. This will occur a maximum of four cycles after RESETB# is set to a zero. This signal is also used
in conjunction with the TESTACT# input to disable outputs.

Symbol: VRESET#
Type: I
Name and Function: VERTICAL RESET: By programming a bit in an internal register, the 82750DB may be placed
in the Genlock mode. If this mode is selected, assertion of VRESET# resets all vertical timing to the first
line of the next field. It does not affect the horizontal timing, but does generate the on-chip end of field
signals. It is an edge sensitive input that is sampled in the 82750DB at the internal time corresponding to
the rising edge of FREQIN. If the Genlock mode has not been enabled, this signal will have no effect on the
sync timing. The 82750DB will then operate in a free-running mode. Refer to Chapter 3 for a detailed
description of genlocking the 82750DB.

Symbol: HRESET#
Type: I
Name and Function: HORIZONTAL RESET: When in the Genlock mode, this input will reset all of the horizontal
timing to the start of the line (beginning of horizontal sync). HRESET# does not affect vertical timing
(except for an up-to one-line delay) or any other 82750DB registers. This signal is an edge sensitive input
that is sampled in the 82750DB at the internal time corresponding to the rising edge of FREQIN. As was the
case with the VRESET# signal, this input will be ignored when not in the Genlock mode.

Symbol: VBUS[3:0]
Name and Function: VDP COMMUNICATION BUS: The 82750DB outputs status and VRAM transfer requests over these
lines to the 82750PB, for 2 to 16 T-cycles (as programmed by the user). Transfer requests can tie up the
82750DB/VRAM, 82750PB/VRAM, or 82750PB/82750DB (VBUS) interfaces for a longer period due to VRAM arbitration.
When signals are not being sent out, the VBUS has value 1111, the "null command."

Symbol: SCLK[1:0]
Name and Function: VRAM SHIFT CLOCKS: Transfer requests to the 82750PB cause a VRAM address to be set up, and
the VRAM serial registers loaded (in the case of displaying) or unloaded (in the case of digitizing). These
signals are used to shift data out of and into the VRAMs. Both signals are identical, and run at a maximum
rate of 1X of the pixel frequency, except during transfer requests, at which time they run at 1X, 1/2X, or
1/3X of the operating frequency of the 82750DB, as programmed by the user.

Symbol: DATAIN[31:0]
Type: I
Name and Function: DATA INPUT BUS: This is the input data clocked in from VRAM by the SCLK[1:0] signals. The
format of the input data is a function of the programmed number of bits/pixel and of the type of transfer
cycle being executed. Data will be sampled internally on the rising edge of FREQIN.

Table 1-3. Pin Descriptions (Continued)

Symbol: FCO
Name and Function: FRAME CAPTURE ON: This is the output signal which indicates to the digitizer that the VRAM
serial port has been turned from read mode to write mode. The digitizer may then drive the (common) VRAM
serial register data I/O pins. FCO will be asserted after the programmer specifies digitization, five lines
after the start of the active vertical display, at the time of HSYNC. This gives the external logic time to
switch directions of the VRAM serial data bus. This signal will end four lines after vertical active stops, at
the next HSYNC, to make sure the digitizer is off before the next beginning-of-field register transfer.

Symbol: HSYNC
Name and Function: HORIZONTAL SYNCHRONIZATION: Video synchronization signal which is asserted at the beginning
of every line and ends a programmed time later. (The duration of this signal is specified in T-cycles.)

Symbol: VSYNC
Name and Function: VERTICAL SYNCHRONIZATION: Video synchronization signal which can be programmed to start
(once) and end (once) in every field. (The start and stop position may be specified in half-line units.)

Symbol: CSYNC
Name and Function: COMPOSITE SYNCHRONIZATION PULSE: This contains the programmed vertical serration and
equalization information, as well as horizontal synchronization pulses.

Symbol: CB
Type: O
Name and Function: COMPOSITE BLANKING: This signal can be programmed to end once and start once in each line,
and end once and start once every field.

Symbol: BG
Name and Function: BURST GATE: This signal starts and stops at user-programmable horizontal positions in each
line, in a programmable vertical group of lines. The primary use of this signal is to provide a "window"
during which the BURST output should be inserted to generate a baseband NTSC signal. The output frequency is
set by an integer divisor (0-31) and the rate of the FREQIN clock input. To use this effectively, the 82750DB
must operate at an integer multiple of the NTSC 3.58 MHz color subcarrier. The number is programmed in two's
complement form in the General Control register.

Symbol: PIXCLK
Name and Function: PIXEL CLOCK: This output signals valid data on the DGY, DRV, DBU, GY, RV, and BU lines.
PIXCLK becomes active one-half of a T-cycle after valid data appears on DGY, DRV, or DBU, and coincident with
GY, RV, and BU. During active display time it is issued at a steady rate of 1/(T-cycles/pixel) times per
T-cycle, and otherwise at a steady rate of once per T-cycle. Its duration is one-half of a T-cycle, and its
rising edge may synchronize with either rising or falling edges of FREQIN depending on the pixel frequency.
This signal may be used to synchronize off-chip processing of the pixel data outputs.

Symbol: GY, RV, BU
Name and Function: ANALOG PIXEL OUTPUTS: These signals are the processed pixel data from the 82750DB in analog
form. During the display, these signals may be programmed to output pixel data in either YUV or RGB format.

    Output Format   DGY   DRV   DBU
    YUV             Y     V     U
    RGB             G     R     B

Symbol: DGY[7:0], DRV[7:0], DBU[7:0]
Name and Function: DIGITAL VIDEO OUTPUTS: These are the digital outputs of the GY, RV, and BU channels,
respectively. They are valid with respect to the rising edge of PIXCLK.

Symbol: ALPHA[7:0]
Name and Function: ALPHA CHANNEL: These 8 bits are used to output a digital value for mixing the 82750DB
output with another video signal off-chip. The alpha channel information may be included in the pixel data, or
may be output based on a comparison of the pixel data with user-programmed values.

Symbol: ACTDIS
Name and Function: ACTIVE DISPLAY: This is the active portion of the display as programmed by the user. It is
delayed by the pipeline through the 82750DB, which is 5 lines vertically and a variable number horizontally,
depending on the display mode.

Table 1-3. Pin Descriptions (Continued)

Symbol: BPP[1:0]
Name and Function: BITS PER PIXEL: During the nonactive display, the user programmed bits/pixel is encoded on
these lines. During active display, the BPP[0] signal is multiplexed with a signal, Cursor Active, which
indicates if the cursor data is currently active (non-transparent). When the Cursor Active output signal is
asserted, this indicates that cursor overlay data is currently being output. Also during the active display,
the BPP[1] signal is multiplexed with a signal, VUGR, which indicates whether the 82750DB is operating in a
graphics or video mode. When the VUGR output signal is asserted, this indicates the G, R, and B outputs are
derived from the subsampled VU data. These pins allow users to latch the BPP[1:0] signals during nonactive
display time (as indicated by ACTDIS being zero) for post-processing of the 82750DB output. The active cursor
window on BPP[0] can be used during active display to multiplex other video streams into the output display.
The following table illustrates the encoding on the BPP signals.

    Bits/Pixel   ACTDIS   BPP[0]          BPP[1]
    8            0        0               0
    16           0        0               1
    32           0        1               0
    pseudo 16    0        1               1
    8            1        Cursor Active   VUGR
    16           1        Cursor Active   VUGR
    32           1        Cursor Active   VUGR
    pseudo 16    1        Cursor Active   VUGR

Symbol: DISDAC
Name and Function: DISABLE ANALOG OUTPUTS: When this input is active, the Analog Pixel Outputs are set to a
high-impedance state.

Symbol: DISDIG
Name and Function: DISABLE DIGITAL OUTPUTS: When this input is active, the digital outputs of the 82750DB will
be set to zero. In applications that use only the analog outputs of the 82750DB, the digital outputs must be
disabled.

Symbol: TESTACT#
Name and Function: TEST ACTIVE: Active low signal that is used in conjunction with the RESETB# signal to allow
the chip to perform one of the following functions:

    RESETB#   TESTACT#   82750DB State
    0         1          Enter Reset State
    0         0          Enter Reset State; Tristate All Outputs; Analog Outputs are Zero
    1         1          Normal Operation
    1         0          Reserved

Symbol: TEST#
Name and Function: TEST INPUT: This signal must be set to VCC to guarantee correct chip operation.

Symbol: VGCS
Name and Function: INTERNAL VOLTAGE REFERENCE: This signal must be decoupled to AVCC.

Symbol: IREFIN
Name and Function: ANALOG CURRENT REFERENCE: Under normal operation, this signal should be tied to a
temperature compensated current reference to AVSS. This signal must be decoupled to AVCC.

Symbol: AVcc
Name and Function: ANALOG POWER pin provides +5 VDC supply to the Digital to Analog Converter.

Symbol: AVss
Name and Function: ANALOG GROUND pin provides the 0V connection to which the analog outputs are referenced.
This must be connected to VSS.

Symbol: Vcc
Name and Function: POWER pins provide +5 VDC supply input.

Symbol: Vss
Name and Function: GROUND pins provide the 0V connection to which all inputs and outputs are referenced.
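
The BPP[1:0] multiplexing described in Table 1-3 can be illustrated with a short Python sketch. It assumes the nonactive-display rows encode 8, 16, 32, and pseudo 16 bits/pixel as (BPP[0], BPP[1]) = (0,0), (0,1), (1,0), (1,1), which is one reading of the BPP table; the function name and return format are hypothetical and not part of the data sheet.

```python
# Assumed encoding of (BPP[0], BPP[1]) while ACTDIS is 0, read from the BPP table.
BPP_ENCODING = {
    (0, 0): "8",
    (0, 1): "16",
    (1, 0): "32",
    (1, 1): "pseudo 16",
}


def decode_bpp_pins(actdis, bpp0, bpp1):
    """Interpret the BPP[1:0] pins according to the state of ACTDIS.

    With ACTDIS = 0 the pins carry the programmed bits/pixel; with
    ACTDIS = 1, BPP[0] carries Cursor Active and BPP[1] carries VUGR.
    """
    if actdis == 0:
        return {"bits_per_pixel": BPP_ENCODING[(bpp0, bpp1)]}
    return {"cursor_active": bool(bpp0), "vugr": bool(bpp1)}
```

External logic would typically latch the bits/pixel value while ACTDIS is zero and treat the same pins as per-pixel flags once ACTDIS goes high.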

Table 1-4. Input Pins

Name        Active Level   Synchronous/Asynchronous
FREQIN      HIGH           Synchronous
RESETB#     LOW            Asynchronous
VRESET#     LOW            Asynchronous
HRESET#     LOW            Asynchronous
DISDIG      HIGH           Asynchronous
TESTACT#    LOW            Asynchronous
TEST#       LOW            Asynchronous

All output pins have an active level of HIGH, and are floated when RESETB# and TESTACT# are set to a zero.
The exceptions are GY, RV, and BU, which will be forced to a zero level.
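
The combined effect of RESETB# and TESTACT# can be summarized as a lookup. This is a minimal Python sketch, assuming the four pin combinations map to states as listed in the TESTACT# entry of Table 1-3 (with unmarked cells read as 0); the function name and the state strings are illustrative only.

```python
# (RESETB#, TESTACT#) pin levels -> chip state, as read from the TESTACT# table.
CHIP_STATES = {
    (0, 1): "reset",
    (0, 0): "reset, outputs tristated, analog outputs zero",
    (1, 1): "normal operation",
    (1, 0): "reserved",
}


def chip_state(resetb_n, testact_n):
    """Return the 82750DB state for the given active-low pin levels."""
    return CHIP_STATES[(resetb_n, testact_n)]
```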



2.0 ARCHITECTURE 



Overview 

There are 10 units in the 82750DB. Each of the units operates independently at the maximum clock rate input
to the chip. The control information for each block is distributed in programmable registers throughout the
chip. These registers are loaded on user-specified lines during the horizontal and vertical blanking intervals
of the field. The register data that was read in from VRAM is passed from block to block during the blanking
intervals of the display, on the same lines that the pixel information is passed during the active display.
The Functional Block Diagram is shown in Figure 2-1.

In order to maximize speed and compensate for processing delays, the chip is heavily pipelined. All
inter-block information is delay-equalized to accommodate the different pipeline lengths in each module. As a
result, the total pipeline delay is dependent on the number of processing units that are used to generate the
display. Chapter 4 describes how the user programming is affected by these pipeline delays.

Each of the units is described in more detail in the following sections of this chapter.



Sync Generation and Timing 

The sync generation and timing block generates all of the internal timing and control signals, as well as the
video synchronization signals. Sync and timing information may be derived from two sources: from the master
clock, in which case the control registers on the 82750DB are programmed to provide the desired display
frequency in terms of periods of the master clock (T-cycles), or from the horizontal and vertical external
reset signals. (The latter is known as the genlock mode.) Characteristics such as line rate, blanking and
border intervals, and composite synchronization parameters can be independently set. Since the 82750DB can be
reprogrammed once each line, horizontal strips of different resolutions can be supported on the same display.
However, the horizontal strips that can be supported are limited by the host processor's response to
redefining the bitmap pointers resident on the 82750PB.

The horizontal and vertical display parameters are fully programmable. Figure 2-2 illustrates the horizontal
programming parameters. The line starts at the programmed start position, with the length of half of a line
programmed in T-cycles. The length of the total line is twice the half-line length. Parameters such as
horizontal sync start, horizontal sync width, horizontal blanking start and stop, and horizontal active start
and stop are all specified by the user. Note that the border time is not explicitly programmed, but is defined
as the region of the display line where neither active display nor blanking is programmed to occur. In order
for the 82750DB to function correctly, the width of the horizontal active display should be programmed such
that the end of the horizontal active display coincides with the end of the last displayed pixel.
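
As a worked illustration of the half-line programming, the following Python sketch derives the line rate from FREQIN and a programmed half-line length, and checks it against the 4 kHz minimum line rate given in the FREQIN pin description. The function names and the example values are assumptions for illustration, not register names from the data sheet.

```python
MIN_LINE_RATE_HZ = 4_000  # minimum line rate needed for VRAM refresh (FREQIN pin description)


def line_rate_hz(freqin_hz, half_line_t_cycles):
    """Line rate for a given FREQIN and programmed half-line length in T-cycles."""
    line_t_cycles = 2 * half_line_t_cycles  # total line is twice the half-line length
    return freqin_hz / line_t_cycles


def refresh_ok(freqin_hz, half_line_t_cycles):
    """True if the resulting line rate meets the 4 kHz refresh minimum."""
    return line_rate_hz(freqin_hz, half_line_t_cycles) >= MIN_LINE_RATE_HZ
```

For example, a 28 MHz FREQIN with a half-line length of 875 T-cycles gives a 1750 T-cycle line, well above the refresh minimum.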

Figure 2-3 shows the vertical programming parameters. The basic unit for vertical programming is the half
line, with the half-line count for each field starting at zero. Where appropriate for a parameter, the count
is programmed in units of full lines. The length of the complete field is programmed in half lines, which
makes it convenient for distinguishing between interlaced and non-interlaced displays. (For interlaced
displays, the number of half lines is odd; for non-interlaced displays, it is even.) The vertical active and
blanking regions may be independently programmed, with the border time defined as the region where neither
blanking nor active display is on.
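
The odd/even rule for the field length can be expressed directly. This is a small Python sketch; the function name is hypothetical, and the 525 half-line example corresponds to an NTSC field of 262.5 lines.

```python
def is_interlaced(field_half_lines):
    """An odd number of half lines per field indicates an interlaced display."""
    return field_half_lines % 2 == 1


# An NTSC field is 262.5 lines, i.e. 525 half lines, so it is interlaced.
ntsc_interlaced = is_interlaced(525)
# A 262-line field is 524 half lines, so it is non-interlaced.
progressive_262 = is_interlaced(524)
```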

NOTE:
Sync parameters are completely independent of the display parameters. This allows the sync signals to be
positioned anywhere in the field (even during active display).

Figure 2-1. 82750DB Unit Level Diagram

Figure 2-2. Horizontal Programming Parameters

Timing points labeled along the line: Start of Horizontal Sync, End of Horizontal Sync, End of Horizontal
Blanking, Start of Horizontal Active, End of Horizontal Active Display, Start of Horizontal Blanking, End of
Line Count.

• All horizontal programming parameters are in periods of the master clock.
• Border may be eliminated by programming the blanking time to abut the active display.
• Pixel widths must be an integer divisor of horizontal active width.
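
The pixel-width constraint noted for Figure 2-2 can be checked ahead of time. The sketch below is a minimal Python example, assuming pixel widths run from 1.0 to 14 T-cycles in steps of 0.5 (as the feature list suggests) and that the active width is given in T-cycles; the function names are hypothetical.

```python
from fractions import Fraction


def valid_pixel_width(width_t):
    """True for widths of 1.0, 1.5, 2.0, ... up to 14 T-cycles (assumed 0.5 steps)."""
    w = Fraction(width_t)
    return 1 <= w <= 14 and (2 * w).denominator == 1


def pixels_per_line(active_width_t, pixel_width_t):
    """Number of pixels in the active width, or ValueError if it does not divide evenly."""
    w = Fraction(pixel_width_t)
    if not valid_pixel_width(w):
        raise ValueError("unsupported pixel width")
    count = Fraction(active_width_t) / w
    if count.denominator != 1:
        raise ValueError("pixel width does not divide the horizontal active width")
    return int(count)
```

Exact fractions are used instead of floating point so that half-cycle widths such as 2.5 divide cleanly.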



Start Of Field >► 

(Half-line count is 



End Of Vertical Blanking 



Start Of Vertical Active • 



End Of Vertical Active - 



Start Of Vertical Blanking - 




(Programmed half- 
line count reached.) 



• All vertical programming parameters shown are in half lines. 

• Vertical border may be eliminated by abutting active display and blanking times. 

• Positioning of vertical blanking and sync are fully independent. 



Programming Blanking 



□ 



Active Display 
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Figure 2-3. Vertical Programming Parameters 

VBUS Control

The VBUS controller sends all 82750DB requests for display bitmaps, VRAM refresh, and synchronization
information to the 82750PB, at programmable times during a field. Transfer requests are scheduled to occur on
a line basis, so only their vertical position (or line) is specified by the user. Other commands, like refresh
requests, occur every line, and their horizontal position (or dot position) in the line must be specified by
the user. Transfer requests are given the highest priority by the VBUS control circuit and are performed first
during a blanking interval. The programmer has the responsibility of scheduling the line-oriented codes, like
refresh, so that they do not collide with the transfer requests.

Besides arbitrating the scheduled transfer requests, the VBUS controller also reads the data from the VRAM
shift registers using the two shift clock outputs (SCLK[1:0]). The code corresponding to the type of data to
be read is asserted for a programmable number of cycles on the 4-bit VBUS. The 82750DB then waits a
programmable delay before reading the data from the VRAM. This delay should be long enough to guarantee that
the 82750PB has completed loading the information into the serial shift register of the VRAM. Both shift
clock signals are off while the code causing the transfer cycle is active on the VBUS, as well as during the
read delay time. Figure 2-4 illustrates this communication between the 82750PB and the 82750DB.



When the delay wait is over, the shift clock outputs are activated. The behavior of the SCLK[1:0] signals
depends on the transfer rate that the user has selected: 1X, 1/2X, or 1/3X the operating frequency. Note that
if the RESETB# signal is applied, the transfer rate is automatically set to 1/3X during the first automatic
register transfer, regardless of the state of the transfer rate selection. The transfer rate may be changed in
the first register transfer after RESETB# is set to a logic one value.

Figure 2-5 illustrates how the SCLKs operate in the 1X mode in a system. SCLK[1:0] signals will toggle
between zero and one on the rising edge of FREQIN, after an internal logic delay. The data is read into the
82750DB on the rising edge of the internal clock, one 82750DB clock cycle after the SCLK outputs are asserted.
Since there are 32 data input pins, each SCLK can read in the serial data from eight 256 x 4 VRAM memory
devices. Adding external buffering to the SCLKs (to drive more memory) will also add delay to the memory
access. The delay increase may require more than one T-cycle before the VRAM data is valid. In this case, the
time between the rising edge of the internal 82750DB clock that generates the SCLKs and the edge that latches
the data must be increased.

There are two solutions: the operating frequency of the 82750DB can be lowered to accommodate a longer
T-cycle, or the 1/2X SCLK mode may be selected (as shown in Figure 2-6). When using the 1/2X transfer rate,
the data is read into the 82750DB on the rising edge of the internal clock, two 82750DB clock cycles after the
SCLK outputs are asserted.



Figure 2-4. 82750PB/82750DB Communication

Timing diagram referenced to 82750DB FREQIN. Labeled intervals: Programmable 82750DB VBUS Code Length (2-15
82750DB Clock Cycles); Programmable 82750DB Delay (2-255 82750PB Clock Cycles). Labeled events: the 82750DB
initiates the transfer request; the 82750PB must have finished decoding the VBUS code; the 82750PB must have
executed the 82750DB transfer request (data should be in the serial shift register of the VRAM); the 82750DB
samples data.

Figure 2-7 illustrates 1/3X (default) shift clock operation that is used during the RESET mode or may be
programmed by the user. The first word of data is latched by the 82750DB on the rising edge of the FREQIN that
is three T-cycles after the SCLK outputs were asserted. This allows three full 82750DB cycles for the VRAMs to
output valid data, which gives extra margin for applications that need longer shift read cycles (due to slower
memories or external logic delays) and do not wish to operate the 82750DB at a slower speed.
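
The trade-off between the three transfer rates can be sketched as a simple selection rule. The Python example below assumes that the only constraint is that the VRAM data (plus any external buffer delay) must be valid within the 1, 2, or 3 T-cycles between SCLK assertion and the latching edge; it ignores the internal logic delay and setup times, and the function name and parameters are hypothetical.

```python
# T-cycles between SCLK assertion and the edge that latches the data, per transfer rate.
LATCH_DELAY_T_CYCLES = {"1X": 1, "1/2X": 2, "1/3X": 3}


def fastest_transfer_rate(freqin_hz, vram_access_ns, buffer_delay_ns=0.0):
    """Pick the fastest transfer rate whose latch delay covers the access time.

    Returns None if even the 1/3X rate is too fast, in which case the
    operating frequency would have to be lowered.
    """
    t_cycle_ns = 1e9 / freqin_hz
    needed_ns = vram_access_ns + buffer_delay_ns
    for mode in ("1X", "1/2X", "1/3X"):
        if LATCH_DELAY_T_CYCLES[mode] * t_cycle_ns >= needed_ns:
            return mode
    return None
```

At 28 MHz a T-cycle is roughly 35.7 ns, so under these assumptions a 60 ns serial access would call for the 1/2X rate.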



Figure 2-5. 82750DB 1X Shift Clock Operation
(Timing diagram; labeled items: 1 T-cycle, SCLK[1:0], VRAM data.)

Figure 2-6. 82750DB 1/2X Shift Clock Operation

Figure 2-7. 82750DB 1/3X Shift Clock Operation
When reading data from memory during active display, the
SCLK[1:0] outputs operate at the rate required to support the
programmed display rate. This rate is determined from the
following equation:

RATE = (# bits/pixel) / [(32 bits/word) * (# words/fetch) * (# T-cycles/pixel)]

where: # bits/pixel and # T-cycles/pixel are user-programmed
       # words/fetch is 1

The SCLK[1:0] outputs will be at the same frequency as the input
clock in the 1X shift clock mode, one half the input clock
frequency in the 1/2X mode, and one third the input clock
frequency in the 1/3X mode. In the 1/3X mode the SCLK[1:0]
outputs will be high for one T-cycle and low for two T-cycles.
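As a worked example of the RATE equation, the sketch below evaluates it with exact fractions. The function name is hypothetical, and the result is read here as the number of 32-bit word fetches needed per T-cycle, which is an interpretation of the units rather than a statement from the text.

```python
from fractions import Fraction


def sclk_rate(bits_per_pixel, t_cycles_per_pixel, words_per_fetch=1):
    """RATE = bits/pixel / (32 bits/word * words/fetch * T-cycles/pixel)."""
    denominator = 32 * words_per_fetch * Fraction(t_cycles_per_pixel)
    return Fraction(bits_per_pixel) / denominator
```

With 8 bits/pixel and 1 T-cycle/pixel the rate is 1/4, that is, one word fetch every four T-cycles; 32 bits/pixel at 1 T-cycle/pixel needs a fetch every T-cycle.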

VBUS CODE DESCRIPTION 

When the 82750DB is actively fetching and displaying pixels,
VUXFER, BMX/YBMNPX, and REGX are typically sent over the VBUS.
Of the three codes, REGX has top priority, followed by VUXFER,
and lastly BMX/YBMNPX. These commands may be programmed to occur
on each active line, during the blanking interval for the line
just completed. If a register transfer has been programmed for
an active line, it takes priority and is executed first. Any
scheduled VUXFER and BMX/YBMNPX commands are executed
immediately after the register transfer, or at once if no
register transfer is programmed. The programmer is responsible
for verifying that the sum of the times required by these
commands does not exceed the horizontal non-active display time.
The 82750DB will commence fetching pixels at the subsequent
start of active display. A detailed explanation of the different
types of VBUS commands and their corresponding codes follows.

Transfer Requests 

The following commands request the 82750PB to transfer
information from the VRAM array into the VRAM shift register.
When multiple requests are programmed for a given line, they are
listed here in the order of priority in which they are sent.
When asserting a transfer request, the programmer must be aware
of two other programmed parameters, VBLEN and SCLK delay.

The VBLEN parameter is a user programmed value 
whose bits lie in the General Control Register. It is 
the length of time, in 82750DB T-cycles, that a par- 
ticular VBUS code will be held at the outputs. It is 
used to ensure that the asynchronously operating 
82750PB chip will have enough time to recognize 
and begin operating on an 82750DB transfer re- 
quest. 



The other parameter the programmer needs to set is the SCLK
delay, which is found in the Pixel Control Register. It is the
number of 82750DB clock cycles that the 82750DB will wait, after
initiating a transfer request on the VBUS outputs, before
clocking in data from the VRAM.

REGX (0010) This command requests that the 
82750PB transfer 82750DB register information into 
the VRAM shift registers. Besides the automatic 
82750DB register transfer that occurs on the second 
line (line 2) of each field, the programmer can speci- 
fy the next horizontal line on which another register 
transfer is to take place. The transfers may be 
scheduled many times during the field. On the first 
transfer, the 82750PB uses the contents of its 
82750DBc register as the starting address of the 
82750DB register data. On each subsequent ac- 
cess, the programmed pitch value in 82750PB's 
82750DBC-PITCH register is added to the accumu- 
lated start address. The programmer must ensure 
that the data is stored in VRAM at the correct ad- 
dress. Since the pitch remains constant, the longest 
register load will determine the pitch value. 
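The address sequence described for REGX can be sketched as follows. The function names and the example values are hypothetical, and the address units are left abstract because the text does not state them.

```python
def regx_addresses(start, pitch, transfers):
    """Start address the 82750PB would use for each register transfer.

    The first transfer uses the programmed start address; each later
    one adds the constant pitch to the accumulated address.
    """
    return [start + k * pitch for k in range(transfers)]


def minimum_pitch(load_lengths):
    """The pitch is constant, so the longest register load sets it."""
    return max(load_lengths)
```

For instance, three transfers with a start address of 0x1000 and a pitch of 0x40 would read from 0x1000, 0x1040, and 0x1080, so each block of register data must be stored at exactly those addresses.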

The VBUS unit performs a vertical checksum on all 
the register information. Each bit in the register word 
undergoes an exclusive-OR with the corresponding 
bit in the previous data word. The 82750DB com- 
pares this information with the user generated 
checksum, which is the last 32-bit data word read 
into the 82750DB during a register transfer. If the 
values do not match, the 82750DB will disable all of 
its digital sync and data outputs, enter the reset 
state, and send a SHUTDOWN code (82750DBSD) 
to the 82750PB over the VBUS[3:0] outputs. If the 
new checksum is correct, the new register values 
will take effect immediately. 
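A user-generated checksum of this kind can be sketched as a running exclusive-OR over the 32-bit register words. This sketch assumes that the running value starts at zero and that the checksum word itself is excluded from the running XOR; neither detail is stated in the text, and the function names are hypothetical.

```python
from functools import reduce
from operator import xor


def vertical_checksum(words):
    """XOR each 32-bit word, bit by bit, with the running result."""
    return reduce(xor, (w & 0xFFFFFFFF for w in words), 0)


def build_register_load(words):
    """Append the checksum as the last word of the register transfer."""
    return list(words) + [vertical_checksum(words)]


def load_is_valid(load):
    """Compare the last word against the checksum of the words before it."""
    *data, user_checksum = load
    return vertical_checksum(data) == user_checksum
```

A mismatch corresponds to the case in which the 82750DB would enter the reset state and send the SHUTDOWN code.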

VUXFER (0001) This code is used to request VU data, provided
new VU data is required by the 82750DB. This command is issued
only on vertically active lines (as programmed in the register,
not as seen on the screen) and possibly the four lines after.
On each line, a row of V and/or U samples is loaded into the VU
interpolator line stores. The pattern of requests depends upon
the mode in which the VU interpolator is operating. In the
interlaced VU mode, one line of samples for both the V and U
components is fetched during each transfer; in the
non-interlaced VU mode, only one line of samples for either the
V or U component is fetched. Table 2-1 illustrates the pattern
of requests. M is the programmed first vertical active line, and
N the last active line. In the modes listed, VU transfer
requests follow the end of horizontal active of the lines
specified, stopping with the last line, N + 4.
Table 2-1. VU Transfer Request Patterns

Mode                    Active Line   Request VU Data
----------------------  -----------   --------------------------
2x Non-interlaced       M             Fetch 1st Line of V
                        M + 1         Fetch 1st Line of U
                        M + 4         Fetch 2nd Line of V
                        M + 5         Fetch 2nd Line of U
                        N + 4         Fetch Last Line of V

2x Interlaced           M             Fetch 1st Line of V and U
(Odd and Even Fields)   M + 4         Fetch 2nd Line of V and U
                        M + 5         Fetch 3rd Line of V and U
                        N + 4         Fetch Last Line of V and U

4x Non-interlaced       M             Fetch 1st Line of V
                        M + 1         Fetch 1st Line of U
                        M + 4         Fetch 2nd Line of V
                        M + 5         Fetch 2nd Line of U
                        M + 8         Fetch 3rd Line of V
                        N + 4         Fetch Last Line of V

4x Interlaced           M             Fetch 1st Line of V and U
(Odd and Even Fields)   M + 4         Fetch 2nd Line of V and U
                        M + 6         Fetch 3rd Line of V and U
                        N + 4         Fetch Last Line of V and U


The 82750PB uses another internal pointer to cause the VRAM to
load the desired VU data into its shift registers (incrementing
the pointer by a pitch value). This command is asserted for a
programmable number of T-cycles (m), as specified in the
Miscellaneous Control register. The 82750DB then fetches the
samples, tying up the 82750DB/VRAM interface for (n + 2) cycles,
where n is 1/4 the programmable total number of 8-bit samples of
V and U fetched. Note that one extra word, which may overlap the
next VBUS command, is fetched.

By setting a bit in the Miscellaneous Control register, it is
possible to replicate lines of V and U generated by the
interpolator for the entire field. Since each line of VU data is
displayed twice, the rate at which the VU sample map has to be
fetched from VRAM is reduced by 1/2. Table 2-2 lists the
sequence of VU loads.

In some cases, the VU interpolator may cover only a portion of
the display. In those instances, M in Tables 2-1 and 2-2 would
be the first line on which VU interpolation is enabled, and N
would be the last line on which VU interpolation is enabled.
Regardless of the state of the Line Replicate bit, there would
be no vertical pipeline delay between the loading of the first
line of samples and the second line of samples. The first line
of samples would be loaded at M - 1, and the second line at M.
This reduces the delay when switching between interpolation
modes during a single display.

Table 2-2. VU Transfer Request Patterns with Line Replicate



Mode                    Active Line   Request
----------------------  -----------   --------------------------
2x Non-interlaced       M             Fetch 1st Line of V
                        M + 1         Fetch 1st Line of U
                        M + 4         Fetch 2nd Line of V
                        M + 5         Fetch 2nd Line of U
                        M + 8         Fetch 3rd Line of V
                        M + 9         Fetch 3rd Line of U
                        N + 4         Fetch Last Line of V

2x Interlaced           M             Fetch 1st Line of V and U
(Odd and Even Fields)   M + 4         Fetch 2nd Line of V and U
                        M + 6         Fetch 3rd Line of V and U
                        N + 4         Fetch Last Line of V and U

4x Non-interlaced       M             Fetch 1st Line of V
                        M + 1         Fetch 1st Line of U
                        M + 4         Fetch 2nd Line of V
                        M + 5         Fetch 2nd Line of U
                        M + 12        Fetch 3rd Line of V
                        M + 13        Fetch 3rd Line of U
                        N + 4         Fetch Last Line of V

4x Interlaced           M             Fetch 1st Line of V and U
(Odd and Even Fields)   M + 4         Fetch 2nd Line of V and U
                        M + 8         Fetch 3rd Line of V and U
                        N + 4         Fetch Last Line of V and U




BMX (0000) This command requests a bitmap. BMX (0000) is sent
after horizontal active stops, beginning on the fifth line after
vertical active starts, and continuing until the fifth line
after vertical active stops. (There is a vertical pipeline delay
of five lines through the 82750DB, due to internal timing
requirements.) A display whose vertical active region is
programmed to start at line M will have its first active line
displayed at line M + 5. The 82750PB uses an internal pointer to
cause the VRAM shift registers to be loaded with pixel values.
The 82750DB subsequently fetches them as required for display.
This command is asserted on the VBUS for the user-programmed
number of T-cycles and must be completed before active display
begins.

YBMNPX (0100) This command performs a Y bit- 
map transfer without performing a pitch calculation. 
When the line replicate mode is selected by Bit 22 in 
the Miscellaneous Control register, this code is as- 
serted every other display line so that the same line 
of information can be used twice. 
Digitizer Commands

When in the line replicate mode and digitizing an NTSC source
(for example, when genlocking an NTSC source to a system that
uses only a VGA monitor), each line of captured data is
effectively output at twice the rate. Since each line need only
be stored once in memory (it is duplicated automatically in the
display mode), only one WRDIGI code, followed by a WRDIGINP, is
sent every other line. On alternate lines, two WRDIGINP codes
are sent; these select the last address that was written,
without incrementing the 82750PB bitmap address pointer. This is
described in detail in Chapter 3.

WRDIGI (0011) This command requests a write of 
digitized data. The operation of this command is de- 
pendent upon the external hardware and is dis- 
cussed in the section on genlocking (page 29). If 
digitizing is enabled, this command is asserted on 
the VBUS for a programmable number of T-cycles. 
The pointer is then incremented by a pitch value. 
Since each horizontal line is stored in a single row of 
memory, this pitch value is equal to the horizontal 
resolution, in bytes, for non-interlaced bitmaps. For 
interlaced bitmaps, the pitch value is equal to twice 
the horizontal resolution, in bytes. This allows alter- 
nate lines of data to be skipped over in successive 
fields. 
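The pitch rule for WRDIGI can be illustrated with a short sketch. The function names, the base address, and the resolution used in the example are hypothetical.

```python
def wrdigi_pitch(horizontal_resolution_bytes, interlaced):
    """Pitch added to the bitmap pointer after each WRDIGI.

    Non-interlaced bitmaps advance one line per WRDIGI; interlaced
    bitmaps advance two lines so that the other field's lines are
    skipped.
    """
    return horizontal_resolution_bytes * (2 if interlaced else 1)


def digitized_line_addresses(base, horizontal_resolution_bytes,
                             interlaced, lines):
    """Addresses written by successive WRDIGI commands in one field."""
    pitch = wrdigi_pitch(horizontal_resolution_bytes, interlaced)
    return [base + k * pitch for k in range(lines)]
```

With a 512-byte line, an interlaced capture would write one field at offsets 0, 1024, 2048, and so on, leaving the intervening rows for the other field.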

WRDIGINP (0111) This command allows access to digitized data
without performing a pitch calculation. WRDIGINP (0111) requests
that the 82750PB perform a transfer request at the last
calculated address. Note that only a memory transfer cycle is
performed; the pitch value is not added to this address. This
will always ensure that the digitized data is written into the
last selected memory address, in case a physical memory boundary
has been crossed. This command is asserted after the WRDIGI
transfer has completed.

Refresh and Control Commands 

The following signals are used to pass refresh re- 
quests and control information to the 82750PB. 

DFL (1000) The Display Format Load command is 
a maskable host processor interrupt that can be pro- 
grammed to occur at any time during the display. 
This is used by the 82750PB to transfer the shadow 
register contents into the working register set in the 
VRAM interface. This is useful in supporting split- 
screen-type applications, where it is desirable to 
change the bitmap pointers at some point before the 
end of the display. 



82750DBSD (1001) This command is the 
82750DB Shut Down code. During every register 
transfer, the 82750DB keeps an internal vertical ex- 
clusive-or checksum of the register data as it is read 
onto the chip. The last word of data that is read 
during the register transfer is the user-generated 
checksum. If the two checksums match, operation 
proceeds as normal. If they do not match, the 
82750DB enters the reset state and sends this code 
to the 82750PB. The 82750DB will remain reset until 
the reset pin is asserted and negated by the host 
processor. 

REFRESH (1010) This command asks the 
82750PB to generate up to 15 refresh cycles every 
horizontal line. The 82750DB transfer cycles have a 
higher priority than refresh requests in the 82750PB. 
REFRESH will not be asserted if programmed to oc- 
cur at the same time as a transfer request code. 

Video Synchronization Information

The following codes are used to pass the video line 
and field information from 82750DB to the pixel 
processor. 

VEVEN (1101) This code indicates the start of an even (i.e.,
second) field of a frame. This command is sent coincident with
line one of each even field. When genlocking to an external
source (see page 29), the occurrence of a vreset signal during
programmed horizontal active time will cause the 82750DB to
output a VEVEN code on the VBUS.

VODD (1100) This code indicates the start of an odd (i.e.,
first or only) field of a frame. This command is always sent
immediately after RESETB# is negated, and coincident with line
one of the odd field. Similarly, when genlocking, the occurrence
of a vreset signal at any time other than horizontal active time
will cause the 82750DB to output a VODD code on the VBUS.

HLIN (1110) This code marks every horizontal line 
at a programmable point in the line. HLIN is used by 
the 82750PB to increment its horizontal line counter. 
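The 4-bit codes described in this section can be collected into a single lookup, for example when decoding a captured VBUS trace. The dictionary and function names are hypothetical; codes not described in this section are deliberately left out and decode to None.

```python
# VBUS[3:0] codes as listed in the command descriptions above.
VBUS_CODES = {
    0b0000: "BMX",
    0b0001: "VUXFER",
    0b0010: "REGX",
    0b0011: "WRDIGI",
    0b0100: "YBMNPX",
    0b0111: "WRDIGINP",
    0b1000: "DFL",
    0b1001: "82750DBSD",
    0b1010: "REFRESH",
    0b1100: "VODD",
    0b1101: "VEVEN",
    0b1110: "HLIN",
}


def decode_vbus(code):
    """Return the command name for a VBUS[3:0] value, or None."""
    return VBUS_CODES.get(code & 0xF)
```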
Pixel Processing Path

This logic accepts the 32-bit word from the input 
latch and divides the word into the programmed pix- 
el format. This will result in either four 8-bit pixels, 
two 16-bit pixels, one 32-bit pixel, or an 8-bit pixel 
with an 8-bit alpha value (pseudo 16-bit mode). The 
pixels act as addresses to the color table, or may 
bypass the table completely as described below. 
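The division of the 32-bit word into pixels can be sketched as below. The ordering of pixels within the word (lowest-order bits first) is an assumption; the text states that ordering only for VU data. The function name and return shapes are hypothetical.

```python
def split_word(word, bits_per_pixel, pseudo16=False):
    """Divide one 32-bit word into pixels, lowest-order pixel first.

    In the pseudo 16-bit mode each 16-bit half holds an 8-bit pixel in
    its low byte and an 8-bit alpha value in its high byte.
    """
    word &= 0xFFFFFFFF
    if pseudo16:
        halves = [(word >> shift) & 0xFFFF for shift in (0, 16)]
        return [{"pixel": h & 0xFF, "alpha": h >> 8} for h in halves]
    if bits_per_pixel not in (8, 16, 32):
        raise ValueError("bits_per_pixel must be 8, 16, or 32")
    mask = (1 << bits_per_pixel) - 1
    return [(word >> shift) & mask
            for shift in range(0, 32, bits_per_pixel)]
```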

Pixel information may be mixed with the output of 
the VU interpolator, which outputs interpolated sam- 
ples derived from a reduced sample bitmap. The 
least significant bit of Y or LSB of U can be pro- 
grammed to act as a switch between using the ex- 
plicit pixel value of YUV or using the luminance por- 
tion of the pixel with the VU portion obtained from 
the interpolator. If the value of the LSB of Y (or U, 
whichever is selected) is zero, the pixel data is used. 
If the LSB of Y (or U) is one, the output of the VU 
interpolator is used. Note that if the LSB of Y is used 
as the switch flag, the luminance portion of the word 
will be only 7 bits wide. 

The alpha information is also processed in this 
block. The alpha data may come from one of two 
sources: it may be explicitly coded in the pixel word, 
as is the case in the 32-bit/pixel and pseudo 16-bit/ 
pixel mode, or it may be obtained by comparing the 
Y portion of the pixel with a preprogrammed value 
and outputting one preprogrammed value if they 
match and a different value if they do not match. 
This latter capability is known as Alpha Trap. 
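Alpha Trap amounts to a single comparison per pixel, sketched here with hypothetical parameter names for the three preprogrammed values.

```python
def alpha_trap(y, trap_value, alpha_on_match, alpha_on_mismatch):
    """Derive alpha by comparing the Y portion with a programmed value."""
    return alpha_on_match if y == trap_value else alpha_on_mismatch
```

For example, programming a trap value of 0 with a match alpha of 0 would make every pixel whose Y portion is 0 take the match alpha, while all other pixels take the mismatch alpha.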



VU Interpolation 

When VU interpolation is enabled by the program- 
mer, and when the display is in the active region, 
"VU data" will be fetched, as required by the inter- 
polator (by the mechanisms discussed previously in 
the section titled "VBUS Code Description"). This 
data has the format V, V, . . . , V, U, U, . . . , U where 
each V or U is 8 bits, and the bytes are grouped into 
32-bit double-words with earliest in lowest order. 
The number, "N", of V bytes and U bytes is the 
same; N is programmed to be either 256 samples, or 
one of 32 to 192 samples in 32-byte increments. 

The first V data and the first U data fetched on the first line
of VU interpolation supply the VU value for the first active
pixel on that line. All the other VU pairs that are fetched
define values for the grid of pixels located below and to the
right of this one at the spacing set by the VU expansion factor:
every other or every fourth pixel, horizontally and vertically.
Most other VU values are filled in recursively by interpolation.
Wherever there is a pixel which lies between two pixels with
known values, it is given the value of the weighted average of
the known values. Values are understood to be non-negative
integers. When the final value is output, any fractions are
truncated or rounded to the closest odd integer according to the
programmed value of the interpolation round flag. This process
is iterated until all pixels have assigned color values. If the
number of VU data samples loaded into the 82750DB is not enough
to cover the active display area, then the last data sample will
be replicated horizontally across the active display window.
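The horizontal part of this process can be sketched for one line of V or U samples. This is an illustration under stated assumptions, not the device's exact arithmetic: only the truncating variant is modeled (the round-to-odd option is omitted), the weighted average is computed directly rather than recursively, and the function name is hypothetical.

```python
def expand_line(samples, factor, width):
    """Expand one line of V or U samples by 2x or 4x horizontally.

    Known samples land on every 2nd or 4th pixel; pixels in between
    take the weighted average of their two neighbors, truncated to an
    integer. The last sample is replicated to fill the display width.
    """
    if factor not in (2, 4):
        raise ValueError("factor must be 2 or 4")
    if not samples:
        raise ValueError("at least one sample is required")
    out = []
    for i, left in enumerate(samples):
        right = samples[i + 1] if i + 1 < len(samples) else left
        for k in range(factor):
            out.append((left * (factor - k) + right * k) // factor)
    out = out[:width]
    out += [samples[-1]] * (width - len(out))
    return out
```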

As mentioned previously in the VBUS Code Description,
each line of VU data can be used twice by
setting the Line Replicate bit in the Miscellaneous 
Control register. Also, each horizontal VU sample 
can be replicated by setting the VU Replicate bit in 
the Pixel Control register. This will cause the V and 
U pixels generated by the VU interpolator every pixel 
time to be used twice. This can result in an effective 
8X horizontal expansion, which is useful when hori- 
zontal blanking time is at a premium. This bit affects 
the horizontal interpolation algorithm only, and will 
not affect the line loading sequence for VU during 
the active display. 

When interpolation is turned on by the programmer (by specifying
a non-zero number of samples to be fetched), VU interpolation
may nevertheless be disabled for each pixel if the following
conditions are met:

1. Conditional interpolation has been selected by the
   programmer,

   AND

   either of the two user-programmed conditions holds:

   a. Switching on the LSB of U has been selected, and the
      lowest-order bit of the U value fetched for the upper left
      pixel in the block has the value zero. This allows
      switching to occur on a 2 x 2-pixel or 4 x 4-pixel grid,
      depending on the expansion mode the user has selected. The
      full 8 bits of Y and V are used, but the usable space of U
      is decreased to 7 bits.

   b. Switching on the LSB of Y has been selected, and the
      low-order bit of the Y value for the current pixel has a
      value of zero.

2. Display of fetched and interpolated VU values may also be
   suppressed by setting the Interpolation Output Enable bit (in
   the Miscellaneous Control register) to zero. This will allow
   VU data to be loaded into the VU line stores without
   displaying VU data. This is useful when a mid-screen
   transition is made between two interpolation modes, to
   compensate for the vertical latency of the interpolation
   process.
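The conditions above can be expressed as a per-pixel predicate. This sketch assumes interpolation has already been turned on (a non-zero sample count); the function and parameter names are hypothetical.

```python
def uses_interpolated_vu(conditional, switch_on, y, block_u,
                         output_enable=True):
    """Return True if this pixel takes V and U from the interpolator.

    conditional   -- conditional interpolation selected
    switch_on     -- "Y" or "U": which LSB acts as the switch flag
    y             -- Y value of the current pixel
    block_u       -- U value fetched for the upper left pixel in the block
    output_enable -- Interpolation Output Enable bit
    """
    if not output_enable:
        return False
    if not conditional:
        return True
    if switch_on == "U":
        flag = block_u & 1
    elif switch_on == "Y":
        flag = y & 1
    else:
        raise ValueError("switch_on must be 'Y' or 'U'")
    return flag == 1
```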
Colormap Lookup Table (CLUT) Operation

The 82750DB contains three 256 x 8-bit color look- 
up tables. The color maps can be accessed sepa- 
rately, or may act as one large 256 x 24-bit table. 
The manner in which the tables are addressed is 
determined by the programmed bits/pixel and de- 
pends on whether the pixel is a graphics or video 
pixel. Also each Y, U, and V color table address can 
be masked. The masks can be used in all the bit/ 
pixel modes, but are most useful with the 16-bit/pix- 
el mode. In this mode, the mask allows the YUV 
values to be mapped to 8-bit values instead of 6-5-5. 

Each channel (Y, U, V) has a MASK SET register and a MASK DATA
register, which select the color lookup address bits to be
changed and the new values of those bits, respectively. A simple
mask operation on one channel is illustrated in Figure 2-8.

The CLUT address mask operation is determined by the logical
equation:

Result = (MASK SET and MASK DATA) | (not MASK SET and data byte)



For modes that require both video and graphics to pass through
the color table, the table can be split into two halves: one
half for graphics and the other for video pixels. By using the
SPLITCLUT bit in the Miscellaneous Control register in
conjunction with the LSB of Y or U, the color table address is
forced to either the video table or the graphics table
automatically. In this case, the masking operation is still
used, but the address is forced to either an even or an odd
entry, regardless of the results of the masking operation. The
flag bit that decides between the two types of pixels
automatically selects the correct portion of the CLUT for a
single channel; that is, the LSB of Y or U selects the proper
half of the CLUT for that single component. The SPLITCLUT mode
ensures that the proper half of the CLUT is used for all three
components.

The color table can be bypassed completely when 
displaying either graphics or video, independent of 
the programmed bits/pixel. This is programmed by 
the user via the VIDEO PASS and GRAPHICS PASS 
bits in the Miscellaneous Control register. Table 2-3 
summarizes the various modes when using the 
CLUT. 



Each bit of the Result byte is determined individually by the
CLUT address mask equation. The Result byte is then further
processed in order to produce the CLUT RAM address.
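The mask operation can be sketched per channel as a bitwise select: where a MASK SET bit is 1 the result takes the MASK DATA bit, and elsewhere it takes the data byte bit. The second function models the SPLITCLUT forcing of the address LSB, using the statement elsewhere in this chapter that video pixels use the odd entries. Both function names are hypothetical.

```python
def mask_clut_address(data_byte, mask_set, mask_data):
    """Replace the address bits selected by MASK SET with MASK DATA."""
    return ((mask_set & mask_data) | (~mask_set & data_byte)) & 0xFF


def split_clut_address(address, is_video):
    """Force the LSB after masking: odd for video, even for graphics."""
    return (address & 0xFE) | (1 if is_video else 0)
```

With MASK SET = 00000011 and MASK DATA = 00000010, a data byte of 10110100 yields the address 10110110.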
Figure 2-8. Mask Operation on CLUT Address

Table 2-3. CLUT Modes
(Columns: Graphics Pass, Video Pass, SPLITCLUT, Colormap Address)
When writing to the CLUT, the most significant byte of the data
word corresponds to the address, and the least significant 24
bits are the YUV data (Y, U, and V from least significant to
most significant, respectively). An index register is used to
allow the 6-bit address to be mapped to an 8-bit number. (Refer
to Chapter 4 for more information.) By resetting the 82750DA
Disable bit, it is possible to make the CLUT look like the
reduced-entry color lookup table on the 82750DA.
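A CLUT write word with this layout could be assembled as follows. The function name is hypothetical, and the sketch covers only the full 8-bit address case; the index register mapping of the 6-bit address is described in Chapter 4 and is not modeled here.

```python
def clut_write_word(address, y, u, v):
    """Pack a CLUT entry: address in bits 31-24, then V, U, Y.

    Y occupies the least significant byte and V the most significant
    byte of the 24-bit data field.
    """
    for name, value in (("address", address), ("y", y), ("u", u), ("v", v)):
        if not 0 <= value <= 0xFF:
            raise ValueError(name + " must fit in 8 bits")
    return (address << 24) | (v << 16) | (u << 8) | y
```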

The following paragraphs summarize the possible bits/pixel
modes, using the LSB of Y or U switching ability and the various
graphics and video bypass modes. Note that there are modes where
the LSB of Y or U is not used to switch between graphics and
video.



8-BIT/PIXEL GRAPHICS MODE 

This is the graphics-only mode, in which the 8 bits 
are used as inputs to all three color tables. This 
makes the color maps look like a single, 256 x 24-bit 
CLUT and allows 256 unique colors from a palette of 
16 million to be available at any given time. If the 
Graphics Pass bit is asserted, the CLUT will be by- 
passed and the 8-bit values of the Y, U, and V chan- 
nels will be input to each channel of the converter 
matrix. 



8-BIT/PIXEL VIDEO MODE 

When used with subsampled VU information from 
the interpolator, the 8 bits are actually a luminance 
value. The Y portion addresses the Y color table, V 
the V color table, and U the U color table. By using 
the color table, a one-to-one mapping exists, allow- 
ing non-linear transformations to be applied to the 
pixel data to enhance the quality of the reconstruct- 
ed image. By asserting the VIDEOPASS bit in the 
Miscellaneous Control register, the color table can 
be bypassed. 

8-BIT/PIXEL MIXED MODE 

In the 8-bit/pixel mixed mode the LSB of Y or U is used as a
switch flag to change the index to the color tables. When the
switch flag is set to a one, the Y value corresponds to a
luminance value, and the VU values are the chrominance
information obtained from the VU interpolator. In this case each
video component is used as an address to its corresponding CLUT
as described above. When the switch flag is set to a zero, the
VU values are not used and the Y value is used as the address to
all color tables. These pixels are treated the same as in the
8-bit/pixel graphics mode.

In this mode the applications programmer must en- 
sure that the proper information has been loaded 
into specific areas of the color maps. For example, 
all the video pixels will use the odd address values. 
By restricting the address used in the graphics and 
video mode, two unique maps may coexist in the 
tables. One map is used for non-linear transforma- 
tions on video data, and the other for graphics color 
lookup table applications. 

As illustrated above, the CLUT can be bypassed by 
asserting either or both of the bypass controls. 

PSEUDO 16-BIT/PIXEL GRAPHICS MODE 

In the pseudo 16-bit/pixel graphics mode each 
32-bit data word is made up of two, 16-bit pixel 
words. The 82750DB processes each 16-bit pixel 
word, so that the least significant 8 bits correspond 
to pixel information, and the most significant 8 bits 
are used as alpha information. The 82750DB uses 
the lower 8 bits as inputs to all three color tables. 
This makes the color maps look like a single, 256 x 
24-bit color table. If the Graphics Pass bit is assert- 
ed, the CLUT will be bypassed and the 8-bit values 
of the Y, U, and V channels will be input to each 
channel of the converter matrix. 



PSEUDO 16-BIT/PIXEL VIDEO MODE 

When used with subsampled VU information, the least significant
8 bits of the pixel word are actually a luminance value. The
most significant 8 bits are used as alpha information. The VU
information is generated by the 82750DB interpolator. Each of
the color maps uses the corresponding 8-bit video component as
an address. By asserting the Video Pass bit in the Miscellaneous
Control register, the color table can be bypassed.







PSEUDO 16-BIT/PIXEL MIXED MODE 

In this mode the LSB of Y or U is used as a switch flag
to change the index to the color tables. When the 
LSB of Y or U is set to a one, the lower 8-bit value 
corresponds to a luminance value, and the V and U 
values are the chrominance information. In this 
case, each video component of the 82750DB is 
used as a colormap address as described above. 
When the LSB of Y or U is set to zero, the V and U 
values from the interpolator are not used, and the Y 
value is used as the address to all color tables. 



16-BIT/PIXEL GRAPHICS MODE 

The 16-bit pixel word is broken up on the 82750DB 
to yield 6 bits of Y, and 5 bits each of V and U. The Y 
bits are the least significant, and the U bits are the 
most significant. These values are then padded with 
zeros in the lower order bits, to obtain an 8-bit word 
for each pixel component. Each component ad- 
dresses its respective CLUT. However, the Y chan- 
nel may access only 64 unique locations, and 5-bit 
resolution for VU restricts them to 32 unique loca- 
tions each. The address range may be extended by 
using the colormap mask registers to add 2 bits of 
precision in the least significant bits for Y and 3 least 
significant bits each for VU channels. This allows the 
programmer to access all the entries in the color 
table by reprogramming the MASK DATA and MASK 
SET registers during the blanking interval. 
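The 6-5-5 unpacking and the use of the mask registers to reach the remaining table entries can be sketched as below. The bit positions follow the text (Y least significant, U most significant, V in between); the function names are hypothetical, and the second function simply applies a mask to the zero-padded low bits.

```python
def unpack_655(pixel):
    """Split a 16-bit pixel into zero-padded 8-bit Y, V, U addresses."""
    y6 = pixel & 0x3F            # bits 5-0
    v5 = (pixel >> 6) & 0x1F     # bits 10-6
    u5 = (pixel >> 11) & 0x1F    # bits 15-11
    return y6 << 2, v5 << 3, u5 << 3


def extend_address(address, mask_data, low_bits):
    """Fill the padded low bits (2 for Y, 3 for V or U) from MASK DATA."""
    mask_set = (1 << low_bits) - 1
    return (address & ~mask_set & 0xFF) | (mask_data & mask_set)
```

Without masking, Y can reach only every fourth entry and V and U every eighth; changing MASK DATA during blanking selects which of the intermediate entries are used.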



16-BIT/PIXEL VIDEO MODE 

This mode works like the 8-bit/pixel video mode de- 
scribed above, except that the 82750DB has pro- 
cessed the information so that the Y channel con- 
tains the least significant 8 bits of the 16-bit data 
word. The V and U information is generated by the 
VU interpolator. If the SPLITCLUT mode is selected,
the LSB of the address is forced to an odd entry in 
the three color tables. 



16-BIT/PIXEL MIXED MODE 

When the switch flag is zero, the graphics mode is selected and
the inputs to the CLUT are the respective YUV data in the 6-5-5
format. These pixel values are extended by using the colormap
masking registers. When the switch flag indicates the video
mode, the lower 8 bits of the 16-bit pixel word and the VU
values obtained from the interpolator are input to their
respective CLUTs. If the SPLITCLUT mode is selected, the LSB of
the address is forced to either an odd or even entry in the
three color tables, depending on whether the data is video or
graphics information.

32-BIT/PIXEL GRAPHICS MODE 

Eight bits each of Y, U, and V are used as addresses 
to each segment of the color table. Since the size of 
the addressable color space is not increased, the 
advantage of using the color map is for special ef- 
fects or gamma correction. The most significant 8 
bits of the 32-bit data word are used for the alpha 
channel data. If the Graphics Pass bit is asserted, 
the CLUT will be bypassed and the 8-bit values of 
the Y, V, and U will be input to each channel of the 
converter matrix. 

32-BIT/PIXEL VIDEO MODE 

The Y channel contains the least significant 8 bits of 
the 32-bit data word. The U and V information is 
generated by the VU interpolator. The YUV channels 
are input to their respective color tables. The size of 
the addressable color space is not increased, but 
this can be used to take advantage of a non-linear 
transformation, which may aid in the decompression 
process. The most significant 8 bits of the data word 
are used for the alpha channel data. 

32-BIT/PIXEL MIXED MODE 

When the switch flag is zero, the graphics mode is 
selected, and the inputs to the CLUT are the respec- 
tive 8 bits each of YUV data. These pixel values may 
be masked by using the colormap mask data and 
mask set registers. When the switch flag indicates 
the video mode, the lower 8 bits of the pixel word 
and the VU values obtained from the interpolator are 
input to their respective CLUTs. If the SPLITCLUT 
mode is selected, the LSB of the address is set to 
either an odd or even entry in the three color tables, 
depending on whether the data is video or graphics 
information. The most significant 8 bits of the data 
word are used for the alpha channel data. 
Y Interpolator

The Y Interpolator performs a 2X horizontal linear 
interpolation on each line of Y values. When Y inter- 
polation is enabled, the internal pixel clock is twice 
the frequency of PIXCLK output. 

NOTE:
If Y interpolation is enabled, then only integer values of pixel
times greater than 1X may be used.

The interpolation may be separately controlled for video and
graphics pixels via the Viden and Gren bits (bits 12 and 11) of
the General Control register. A video pixel is defined as one
generated using VU interpolated values. A graphics pixel does
not use the VU interpolator. The effects of the control bits,
the 82750DB enable flag, and the video/graphics pixel switch
(V/G Switch) on the output of the interpolator are summarized in
Table 2-4.

Because of the asymmetric nature of the internal pixel clock
used on the 82750DB, the number of T-cycles between successive Y
pixels varies depending on the programmed pixel width. When
enabled, there is a pipeline delay through the Y Interpolator
equal to the number of T-cycles between each internal pixel
clock.

When the interpolator is bypassed as described above, there is a
fixed delay through this block. The V and U data are delayed by
one pixel clock to allow the chroma data to line up with the
luminance data. Other control signals, such as the register
address byte (most significant byte of the 32-bit data word read
from VRAM), the pixel clock, horizontal and vertical active
displays, composite blanking, and register load enable signals,
are also delayed by one pixel clock in order to line up with the
YUV data. The programmer must ensure that the active display
timing is programmed to take the appropriate delay through the Y
Interpolator into account.

Table 2-4. Control Bit Settings and Resulting Interpolator Output



82750DB 
Enable 


Viden 


Gren 


V/G 
Switch 


Result 





X 


X 


X 


Interpolator 
Bypassed 










X 


Interpolator 
Bypassed 







1 





Interpolate 
Graphics Pixel 







1 


1 


Do Not 
Interpolate 
Video Pixel 




1 





1 


Interpolate 
Video Pixel 




1 








Do Not ( 
Interpolate 
Graphics Pixel 




1 


1 


X 


Interpolate 
Both Video 
and Graphics 
Pixels 
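
The decision summarized in Table 2-4 can be sketched as a small function. This is an illustrative model only: the function name and the returned strings are invented for the example, and it assumes that a V/G Switch value of 1 marks a video pixel and that a cleared 82750DB enable flag always bypasses the interpolator.

```python
def y_interpolator_action(db_enable, viden, gren, vg_switch):
    """Model of Table 2-4 (assumed polarity: vg_switch 1 = video pixel)."""
    if not db_enable or not (viden or gren):
        return "bypassed"
    # Viden governs video pixels, Gren governs graphics pixels.
    enabled_for_pixel = viden if vg_switch else gren
    return "interpolate" if enabled_for_pixel else "do not interpolate"
```

With both Viden and Gren set, the V/G Switch no longer matters, which corresponds to the last row of the table.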




Cursor 

Hardware support for a 16 x 16-pixel cursor has 
been included on the 82750DB. The cursor is capa- 
ble of providing sharp color transitions, when using 
subsampled VU bitmaps. Software intervention is 
minimized, leaving the host with more processing cy- 
cles to perform other operations. 

Under normal operation, the XY starting display
position of the cursor is loaded into the Cursor
Control register during an 82750DB register load. On
the display line corresponding to the Y start
position, the cursor is displayed when the X starting
position (specified in T-cycles) is reached. On the
following 15 lines, the cursor will be displayed at
this X position every line, for both interlaced and
non-interlaced displays.

A normal 82750DB register transfer is used to load 
the entire 16x16x2 bits (16 words of 32 bits each) 
of cursor data. During this register transfer, the cur- 
sor data is distinguished from normal register data 
by placing the Cursor Control register immediately 
before the 16 words of cursor data. When the 
82750DB loads the Cursor Control register, it will in- 
terpret the next sixteen 32-bit words of register data 
as the cursor bitmap, and will disable the other regis- 
ters on the 82750DB from decoding the address 
field of the 32-bit data word. (The checksum of the 
82750DB register data is not performed during the 
loading of the cursor bitmap data.) The cursor bit- 
map will be loaded a line at a time, starting at line 
zero and continuing in sequential order to line 15. 
Each line in the cursor map actually contains sixteen 
2-bit cursor pixels, with the two least significant bits 
corresponding to the first cursor pixel in that line, 
and the two most significant bits corresponding to 
the 16th cursor pixel on that line. Each 2-bit pixel 
may select one of the three Cursor Color registers or 
transparency, according to the format indicated in 
Table 2-5. 

Table 2-5. Cursor Color Registers

Cursor Pixel   Output
00             Transparency (Cursor Pixel Not Displayed)
01             Cursor Color Register 1
10             Cursor Color Register 2
11             Cursor Color Register 3
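
The packing of the cursor bitmap described above can be sketched as follows. The function names are hypothetical, and the sketch covers only the sixteen bitmap words; the Cursor Control register word that must precede them in the register list is not built here.

```python
def pack_cursor_line(pixels):
    """Pack sixteen 2-bit cursor pixels into one 32-bit word.

    pixels[0] is the first (leftmost) cursor pixel and lands in the two
    least significant bits; pixels[15] lands in the two most significant.
    Pixel codes follow Table 2-5 (0 = transparent, 1..3 = color register).
    """
    if len(pixels) != 16:
        raise ValueError("a cursor line has exactly 16 pixels")
    word = 0
    for i, p in enumerate(pixels):
        if not 0 <= p <= 3:
            raise ValueError("cursor pixels are 2-bit values")
        word |= p << (2 * i)
    return word


def pack_cursor_bitmap(rows):
    """Pack 16 lines (line 0 first) into sixteen 32-bit words."""
    if len(rows) != 16:
        raise ValueError("a cursor bitmap has exactly 16 lines")
    return [pack_cursor_line(row) for row in rows]
```

A line whose only opaque pixel is the first one packs to 0x00000001, while a line whose only opaque pixel is the 16th one (color register 3) packs to 0xC0000000.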



Three 24-bit color registers that hold the color infor- 
mation for the cursor may be written to at any time 
during the register load. The cursor may be loaded 
any time during the blanking intervals of the display. 
For displays that do not program the cursor during 
the display, the cursor bitmap may be loaded during 
the vertical blanking interval. 

When the T-cycle count equals the value pro- 
grammed into the X start position of the Cursor Con- 
trol register, the first cursor pixel can be displayed. 



Each 2-bit cursor pixel will select one of the three 
Cursor Color registers or transparency. The 24-bit 
output of one of the three color registers (or the ac- 
tual display pixel data if transparency is used) is in- 
put to the YUV converter. 

The cursor bitmap length is 16 lines, and the width is 
16 pixels. Although the length of the cursor may be 
changed dynamically by chaining register loads to 
update the cursor map, the size of the cursor is de- 
pendent on the type of display. For interlaced dis- 
plays, each line of cursor data will appear on the 
same line of each field. This results in a cursor of 
16 x 32 pixels. For non-interlaced displays, the same 
line of cursor information will appear on the same 
line every field. The cursor in this case will be 16 x 
16 pixels. The size of the cursor may be doubled
independently in the horizontal and/or vertical
direction by setting the 2X Horizontal Cursor or
2X Vertical Cursor bit in the General Control
register. In this case, no new data is loaded into the
cursor map; the data is just replicated in the
corresponding dimension. Table 2-6 summarizes some
of the possible cursor sizes. Note that by loading the
cursor bitmap with different data at the start of
every field, cursor sizes not listed below may be
achieved.

Table 2-6. Cursor Sizes

2X Horz.   2X Vert.                     Cursor Size
Cursor     Cursor     Display           (in Pixels)
Off        Off        Interlaced        16 x 32
On         Off        Interlaced        32 x 32
Off        On         Interlaced        16 x 64
On         On         Interlaced        32 x 64
Off        Off        Non-interlaced    16 x 16
On         Off        Non-interlaced    32 x 16
Off        On         Non-interlaced    16 x 32
On         On         Non-interlaced    32 x 32
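
The entries of Table 2-6 follow a simple rule, sketched below with a hypothetical helper: each 2X bit doubles its dimension, and an interlaced display doubles the height again because every cursor line appears in both fields.

```python
def cursor_size(double_horizontal, double_vertical, interlaced):
    """Return (width, height) of the displayed cursor in pixels."""
    width = 16 * (2 if double_horizontal else 1)
    height = 16 * (2 if double_vertical else 1) * (2 if interlaced else 1)
    return width, height
```

For instance, `cursor_size(True, False, True)` gives `(32, 32)`, matching the second row of the table.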



There is a complex relationship between the cursor 
and the pixel data especially when using non-inte- 
gral divisors of the pixel clocks. Since the pixel data 
output from the 82750DB pixel path always changes 
coincident with the rising edge of the clock, the cur- 
sor start position must be positioned on the rising 
edge of any period of the pixel clock. The program- 
mer must enforce the corresponding restrictions on 
the start and stop position of the cursor.

YUV to RGB Converter

The following equations give the theoretical
relationship between analog RGB components, R, G, B,
and analog YUV components, Y, U, V.

Y = 0.298822 R + 0.586816 G + 0.114363 B             (1a)
V = R - Y = 0.701178 R - 0.586816 G - 0.114363 B     (1b)
U = B - Y = -0.298822 R - 0.586816 G + 0.885637 B    (1c)

where:  0.0 ≤ G, R, B ≤ 1.0
        0.0 ≤ Y ≤ 1.0
        -0.701 ≤ V ≤ +0.701
        -0.886 ≤ U ≤ +0.886



Solving for G, R, B, we can obtain the inverse rela- 
tionship: 

G = Y - 0.509228 V - 0.194888 U    (2a)
R = Y + V                          (2b)
B = Y + U                          (2c)

where:  0.0 ≤ G, R, B ≤ 1.0
        0.0 ≤ Y ≤ 1.0
        -0.701 ≤ V ≤ +0.701
        -0.886 ≤ U ≤ +0.886

The luminance channel for the YUV inputs is presumed
to swing between 0.0V and 1.0V. However, the chroma
components do not and need to be normalized to a
0V to 1V range. The offset binary encoding used to
obtain unsigned numbers must also be accounted for.
This encoding should center the V and U inputs at the
midpoint of the voltage range. The equations for the
normalized version of Y, V, and U (Y', V', and U'
respectively) are:

Y' = Y                       (3a)
V' = 0.5 V / 0.701 + 0.5     (3b)
U' = 0.5 U / 0.886 + 0.5     (3c)

where:  0.0 ≤ Y', V', U' ≤ 1.0
        0.0 ≤ Y ≤ 1.0
        -0.701 ≤ V ≤ +0.701
        -0.886 ≤ U ≤ +0.886

When converting the normalized analog values Y', V',
U' to digital y, v, u values, the D.C. offset and
conversion ranges are compatible with the CCIR 601
standard for digital video. The ranges for the
components and the corresponding Digital to Analog
equivalent equations are given below:

y = (235 - 16) Y' + 16       (4a)
where: 16 ≤ y ≤ 235

v = (240 - 16) V' + 16       (4b)
where: 16 ≤ v ≤ 240

u = (240 - 16) U' + 16       (4c)
where: 16 ≤ u ≤ 240

Substituting the normalized analog voltages of
Equation 3 into Equation 4, we obtain the digital
version of the input data, used in the DVI™
Technology system:

y = (219) Y + 16                 (5a)
v = (112) V / 0.701 + 128        (5b)
u = (112) U / 0.886 + 128        (5c)

where:  0.0 ≤ Y ≤ 1.0
        -0.886 ≤ U ≤ 0.886
        -0.701 ≤ V ≤ 0.701
        16 ≤ y ≤ 235
        16 ≤ v, u ≤ 240

By solving Equation 5 for Y, U, V, and substituting
into Equation 2, we get the relationship between
analog R, G, B and the digital DVI y, u, v data:

G = 0.004566 y - 0.003187 v - 0.001541 u + 0.532242    (6a)
R = 0.004566 y + 0.006259 v - 0.874202                 (6b)
B = 0.004566 y + 0.007911 u - 1.085631                 (6c)

where:  0.0 ≤ R, G, B ≤ 1.0
        16 ≤ y ≤ 235
        16 ≤ v, u ≤ 240

If the inputs of the Digital to Analog Converter are
scaled to accommodate the nominal input range of
0 to 219, we obtain the following relationship between
the inputs to the DVI Technology system (y, v, u)
and the inputs to the Digital to Analog Converters
(r, g, b). Note that all out of range RGB values
(> 255 or < 0 due to excursions in the inputs) are
clipped to 255 or 0.

g = y - 0.698001 v - 0.337633 u + 116.56116    (7a)
r = y + 1.370705 v - 191.45029                 (7b)
b = y + 1.732446 u - 237.75314                 (7c)

where:  16 ≤ y ≤ 235
        16 ≤ v, u ≤ 240
        0 ≤ g, r, b ≤ 255
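
Equation 7 and the clipping rule can be modeled in a few lines. This is a numerical sketch, not a description of the on-chip arithmetic: the function names are hypothetical, and rounding to the nearest integer before clipping is an assumption.

```python
def clip8(x):
    """Round to the nearest integer (assumed) and clip to 0..255."""
    return max(0, min(255, int(round(x))))


def yvu_to_rgb(y, v, u):
    """Apply Equation 7 to digital y, v, u and return (r, g, b)."""
    g = y - 0.698001 * v - 0.337633 * u + 116.56116
    r = y + 1.370705 * v - 191.45029
    b = y + 1.732446 * u - 237.75314
    return clip8(r), clip8(g), clip8(b)
```

Black (y = 16, v = u = 128) maps to (0, 0, 0) and nominal white (y = 235, v = u = 128) maps to (219, 219, 219), the top of the nominal 0 to 219 range. An input such as y = 235, v = 240 drives r well above 255, so it is clipped.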

By substitution of Equation 5 into Equation 1, and by 
converting G, R, and B to digital values, we can ob- 
tain the inverse relationship of Equation 7: 

y = +0.298822 r + 0.586816 g + 0.114363 b + 16     (8a)
u = -0.172486 r - 0.338721 g + 0.511206 b + 128    (8b)
v = +0.511545 r - 0.428112 g - 0.083434 b + 128    (8c)

where:  16 ≤ y ≤ 235
        16 ≤ v, u ≤ 240
        0 ≤ g, r, b ≤ 255
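
The inverse relationship of Equation 8 can be checked numerically with a sketch like the one below. The function name is hypothetical and the rounding to the nearest integer is an assumption; the inputs are taken to be DAC codes in the nominal 0 to 219 range.

```python
def rgb_to_yvu(r, g, b):
    """Apply Equation 8 to digital r, g, b and return (y, v, u)."""
    y = 0.298822 * r + 0.586816 * g + 0.114363 * b + 16
    u = -0.172486 * r - 0.338721 * g + 0.511206 * b + 128
    v = 0.511545 * r - 0.428112 * g - 0.083434 * b + 128
    return round(y), round(v), round(u)
```

Black returns (16, 128, 128) and nominal white (219, 219, 219) returns (235, 128, 128). A saturated red of (219, 0, 0) reaches the upper chroma limit with v = 240.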

Output Equalization 

The units on the 82750DB process the pixel informa- 
tion at the operating frequency of the chip. If the 
output pixel rate is not equal to the maximum fre- 
quency, the units have null states during which pro- 
cessing is suspended. This type of operation is nec- 
essary on the 82750DB because of the large 
amount of pipelining. Table 2-7 gives the pattern of 
T-cycles on the 82750DB during which processing is 
active, according to the programming shown in Ta- 
ble 4-2. 

The pixel information must be output at a rate that is
some sub-multiple of the operating frequency. The
divisor is programmed by the user; the pixel period
may be from 1 to 12 times the period of FREQIN, in
increments of 1/2. Divisors of 13 and 14 are also
programmable. Because non-integral divisors are used,
it is necessary for the 82750DB to output different
information on both phases of FREQIN. This is
illustrated in Figure 2-9, which uses a 2.5 divisor
for the clock. Notice that the pixel clock output
(PIXCLK) transitions fall alternately on the active
and inactive phase of the input frequency, while the
internal pixel clock transitions always occur on the
active phase. Also note that PIXCLK does not have a
50% duty cycle.

The equalizing logic derives a clock that has a peri- 
od equal to the programmed pixel rate, providing an 
edge to sample the output information. This allows 
the Digital to Analog Converter to directly sample 
the output of the pixel data path before performing 
the analog conversion. 

Table 2-7. 82750DB Active T-Cycle Patterns

Pixel Time     Pattern Of Internal
(T-Cycles)     Pixel Clock
1              Always On
1.5            1 On/1 On/1 Off
2              1 On/1 Off
2.5            1 On/1 Off/1 On/2 Off
3              1 On/2 Off
3.5            1 On/2 Off/1 On/3 Off
4              1 On/3 Off
4.5            1 On/3 Off/1 On/4 Off
5              1 On/4 Off
5.5            1 On/4 Off/1 On/5 Off
6              1 On/5 Off
6.5            1 On/5 Off/1 On/6 Off
7              1 On/6 Off
7.5            1 On/6 Off/1 On/7 Off
8              1 On/7 Off
8.5            1 On/7 Off/1 On/8 Off
9              1 On/8 Off
9.5            1 On/8 Off/1 On/9 Off
10             1 On/9 Off
10.5           1 On/9 Off/1 On/10 Off
11             1 On/10 Off
11.5           1 On/10 Off/1 On/11 Off
12             1 On/11 Off
13             1 On/12 Off
14             1 On/13 Off
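
The patterns in Table 2-7 are regular enough to generate. The sketch below uses a hypothetical function that returns one full period of the internal pixel clock as a list of booleans (True for an active T-cycle); it is a reading of the table, not a description of the equalizing logic itself.

```python
def active_pattern(pixel_time):
    """Return one period of active (True) and null (False) T-cycles."""
    doubled = pixel_time * 2
    half_step = doubled == int(doubled)
    if not ((half_step and 1 <= pixel_time <= 12) or pixel_time in (13, 14)):
        raise ValueError("pixel time not programmable")
    n = int(pixel_time)
    if pixel_time == n:
        # Integral pixel time n: 1 On followed by n - 1 Off.
        return [True] + [False] * (n - 1)
    # Pixel time n + 0.5: two pixels spread over 2n + 1 T-cycles.
    return [True] + [False] * (n - 1) + [True] + [False] * n
```

For a pixel time of 2.5 this returns On, Off, On, Off, Off: two pixels every five T-cycles, as in the divide-by-2.5 example.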

Figure 2-9. Divide by 2.5 Pixel Clock



Digital to Analog Converters 

The Digital to Analog Converters (DACs) take three 
channels of video information output from the pixel 
data path, converting it from 8-bit digital values to 
analog voltage levels typically between 0V and 1V.
The conversion is monotonic, and a pixel clock is 
used to derive a two-phase clock internal to the 
DAC. The data is sampled from the output of either 
the pixel path, or the YUV to RGB matrix on the 
rising edge of the internal active phase of this clock. 
The DISDAC input pin can be asserted to disable the 
analog outputs and place them into a high-imped- 
ance state. 



The analog outputs of the triple DAC are referenced
to an external current source, which must be
connected to the IREFIN pin. All the analog outputs
are scaled by this current reference. The value of the
analog output full scale is as follows:

Ifs = Iref * 255 / 18.5

where: Iref is the magnitude of the reference
       current.

The output voltage generated at full scale is:

Vfs = Ifs * Rext

where: Rext is the load resistance value.

A typical output load for the analog outputs (RV, BU,
GY) is 75Ω. The speed of the DAC analog output
rise and fall times is determined by the time
constant:

Rext * (Cext + Cout)

where: Cext is the external capacitance applied and
       Cout is the intrinsic capacitance of an
       analog output.

For high performance the objective would be to
minimize Rext and Cext. The voltage Voutfs can be
determined by any combination of Ifs and Rext, but
must not exceed 1.5V. In addition Ifs must not
exceed 22 mA. The analog outputs must go through
an external buffer to drive a doubly-terminated 75Ω
coax line.
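
The limits above lend themselves to a quick design check. The helper below is a hypothetical sketch that takes the full-scale current as an input rather than deriving it from Iref; the component values in the usage note are examples only, not recommended values.

```python
def dac_output(ifs_amps, rext_ohms, cext_farads, cout_farads):
    """Return (full-scale volts, time constant in seconds, within limits)."""
    vfs = ifs_amps * rext_ohms                      # Vfs = Ifs * Rext
    tau = rext_ohms * (cext_farads + cout_farads)   # Rext * (Cext + Cout)
    # Limits quoted in the text: 1.5V maximum, 22 mA maximum.
    within_limits = vfs <= 1.5 and ifs_amps <= 0.022
    return vfs, tau, within_limits
```

With an assumed 13.3 mA full-scale current into 75Ω and 15 pF of total capacitance, the result is roughly 1.0V at full scale with a time constant of about 1.1 ns. A 22 mA current into the same load gives 1.65V, which exceeds the voltage limit.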

Table 2-8 lists pins which are used to configure the 
triple DAC. 

Table 2-8. Digital To Analog Converter Pins

Signal        Description
IREFIN        Analog Current Reference. Must Be
              Decoupled to AVCC.
VGCS          Internal Voltage Reference. Must
              Be Decoupled to AVCC.
AVCC          Analog Power
AVSS          Analog Ground
GY, RV, BU    Analog Pixel Outputs
DISDIG        Disable Digital Outputs
DISDAC        Disable Analog Outputs



NOTE:

The digital video outputs must be disabled by
setting DISDIG high whenever the analog outputs
are used. Otherwise the AC and DC characteristics
of the DAC are not guaranteed.


3.0 HARDWARE INTERFACE



82750DB Reset Operation 

Upon power-up, the 82750DB is in an indeterminate 
state and must be reset. The RESETB# signal as- 
serted by the host processor is sampled on the ris- 
ing edge of FREQIN. The 82750DB will enter the 
reset state a maximum of four cycles after 
RESETB# is sampled. The 82750DB will request 
the 82750PB to generate VRAM refresh cycles by 
asserting a REFRESH code on the VBUS for 16 T- 
cycles. This code is repeated every 256 T-cycles, 
until RESETB# is negated. 

NOTE: 



The RESETB# input is an edge-triggered input.
After power-up, the host processor must set the
RESETB# input low for a minimum of ten T-cycles
in order to reset the 82750DB. The host must then
set the RESETB# input high to start normal
operation.



When the RESETB# input is released, a Start of
Vertical Field command (VODD) is sent for 16
T-cycles to the 82750PB via the VBUS. This code is
immediately followed by a Register Transfer Request
command (REGX) that is held for 256 T-cycles. This
256 T-cycle wait assures that the 82750PB has ample
time to honor the 82750DB register transfer request.
The register data is then read into the 82750DB from
the serial port of the VRAMs at a rate that is equal
to 1/3 of the operating frequency. If the register
transfer does not terminate after 256 T-cycles, the
82750DB will automatically stop the transfer, send an
82750DBSD code to the 82750PB, and re-enter the
reset state.



During this register transfer, and on all subsequent
register transfers (programmed or automatic), the
82750DB performs a vertical checksum on the
register data. The last 32-bit word read in during a
register transfer is the user-generated checksum of
that register data. If the 82750DB-generated
checksum does not match the user-generated
checksum, the 82750DB sends a SHUTDOWN code to
the 82750PB via the VBUS, and will automatically
re-enter the reset state. The 82750DB will remain in
the reset state until the RESETB# input is toggled by
the host processor. Any VRAM requests or control
signals programmed to occur during this time will be
ignored.

Normal programmed operations start after the first
successful register load. Frame timing will start at
the beginning of a horizontal line and at the
beginning of the first field, sometimes referred to as
line 1 of field 1. There will not be a horizontal sync
pulse on the first line after reset, but HSYNC will be
generated on every line thereafter. All horizontal and
vertical programming parameters, as well as
scheduling of any transfer requests and control
information to be sent on the VBUS, must be set up by
the user during the first register load. Included in
the control information are parameters for the
82750PB to refresh the VRAM. Refresh must occur on
every line. This requires that the line rate of the
82750DB be at least 4 kHz to guarantee that enough
refresh cycles are generated. Additional register
transfers (up to one per line) may be programmed to
occur on any line during the field. As a result of
such a transfer, display characteristics and
programming parameters may be changed.

After the first field, automatic register transfers
will occur on the second line of each subsequent
field. Note that all register transfers will occur at
1/3 of the operating frequency of the 82750DB, unless
the 1X or 1/2X SCLK mode has been programmed by
the user.

Throughout the reset process, the states of all
outputs become valid at various times. Specifically,
after being held low for at least 10 T-cycles,
RESETB# must transition to a high state in order
to initiate normal operation. By the time RESETB#
reaches this low to high transition, the states of
SCLK[1:0], VBUS[3:0], HSYNC, VSYNC, CSYNC,
and FCO are valid. Ten T-cycles following
RESETB#'s transition from low to high, the states of
BG, CB, ACTDIS, PIXCLK, DGY[7:0], DRV[7:0], and
DBU[7:0] become valid. ALPHA[7:0] and BPP[1:0]
signals reach a valid state 10 T-cycles following the
completion of the first register load following reset.



Input/Output Transformation 

In general, the control outputs, including the sync 
signals, are delayed by pipelining effects from their 
corresponding inputs. If the output sync signals are
taken as the time base, the first pixel in a line is
actually fetched by an SCLK that is up to 19 T-cycles
before its corresponding PIXCLK. Some later pixels
may be delayed by an additional number of T-cycles,
depending upon bits/pixel, pixel timing, and whether
Y interpolation is enabled.

Outside of the active display region and before the
blanking output is asserted, border pixels are output.
Where the blanking region has been entered and the
display is not active, the output is the value
contained in the Blanking Color register.

Pixel handling in the active region is defined by three
parameters:

1. The bits/pixel parameter.

2. Whether VU interpolation is in effect or not.

3. Whether the 82750DB Enable bit has been selected.

VU interpolation is in effect for a given pixel if:

1. The VU interpolator is turned on (VU sample load
   set to non-zero load value),

AND

2. VU interpolation display is permitted (VU
   interpolation display operations bit equals 1),

AND

3. One of the two following conditions is met:

   a. Either the interpolation is unconditional,
   OR

   b. The controlling Y or the controlling U sample
      for this pixel has a least significant bit of 1.
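
The three conditions above combine as a logical AND, with the third being an OR of two alternatives. A minimal sketch, using hypothetical parameter names for the register fields involved:

```python
def vu_interpolation_in_effect(vu_sample_load, display_ops_bit,
                               unconditional, controlling_samples=()):
    """Return True if VU interpolation applies to the current pixel."""
    interpolator_on = vu_sample_load != 0          # condition 1
    display_permitted = display_ops_bit == 1       # condition 2
    # Condition 3: unconditional, or a controlling sample has LSB = 1.
    selected = bool(unconditional) or any(s & 1 for s in controlling_samples)
    return interpolator_on and display_permitted and selected
```

Passing the controlling samples as a sequence keeps the sketch neutral about which samples the hardware inspects; the caller supplies whichever values apply to the pixel.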

The value of the alpha output may come from one of
the following three sources:

1. It may be explicitly coded into the pixel data
   (32-bit/pixel and pseudo 16-bit/pixel with Alpha
   modes only).

2. It may be output from one of two programmable
   registers, Alpha0 and Alpha1.

3. During the portion of the display when the border
   is active, the 8 most significant bits of the Border
   Alpha register may be output.

Table 3-1 illustrates how the Alpha outputs are
selected.

Table 3-1. Selecting Alpha Outputs

Alpha     Alpha
Enable    Trap Select   Alpha Output
0         X             Alpha0 Register
1         0             Alpha0 Register
                        (8, 16 bpp)
1         0             MS Byte of Pixel
                        (32, Pseudo 16 bpp)
1         1             Trap Match = 0,
                        Alpha0 Register
1         1             Trap Match = 1,
                        Alpha1 Register
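
The selection in Table 3-1 can be expressed as a short function. This sketch covers only the active display region (the Border Alpha case is left out), and all parameter names are hypothetical; `pixel_has_alpha` stands for the 32 bpp and pseudo 16 bpp with Alpha modes.

```python
def alpha_output(alpha_enable, trap_select, pixel_has_alpha,
                 trap_match, pixel_ms_byte, alpha0, alpha1):
    """Return the 8-bit alpha value selected per Table 3-1."""
    if not alpha_enable:
        return alpha0
    if trap_select:
        # A trap match chooses between the two programmable registers.
        return alpha1 if trap_match else alpha0
    # No trap: alpha comes from the pixel only in modes that carry it.
    return pixel_ms_byte if pixel_has_alpha else alpha0
```

For example, with Alpha Enable set, Trap Select clear, and a 32 bpp pixel, the most significant byte of the pixel is passed through as alpha.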



Genlocking on the 82750DB 

The genlocking algorithm on the 82750DB uses hori- 
zontal and vertical resets, HRESET# and 
VRESET#, obtained from an external device. When 
the Genlock bit in the Miscellaneous Control register 
is off, the 82750DB will ignore all signals present on
its HRESET# and VRESET# inputs. The 82750DB
will resync itself when the programmed end of line 
count is received. This allows the user to turn off 
genlock without having to worry about the state of 
the input video. 

When the Genlock bit is set to one, the 82750DB will 
use the external resets to reset its internal horizontal 
and vertical sync counters. In this case, the width of 
the active line is determined by the HRESET# sig- 
nal, and the length of the field is governed by 
VRESET#. The programmed values for these registers
will be ignored. As shown in Figure 3-1, when
asserted, VRESET# and HRESET# take effect just
after the third falling edge of FREQIN.
VRESET# has no effect on the 82750DB if the first 
half of the first line of an odd field or the second (and 
only) half of the first line of an even field is already in 
progress. HRESET# has no effect on the 82750DB 
if it occurs during the programmed first half of the 
line. The user may decrease the effect of jitter by 
reducing the "window" during which the vertical re- 
set signal is supposed to occur. This can be done by 
scheduling a register load to occur after the vertical 
active display time has ended, thereby decreasing 
the programmable horizontal active window to a size 
acceptable for the video source. When VRESET# is 
received during this reduced, programmed hori- 
zontal active window, the 82750DB is reset to an 
even vertical field. When VRESET# occurs at any
other time in the horizontal scan line, the 82750DB
is set to an odd field.

Figure 3-1. Horizontal and Vertical Reset Timing



Digitizing Images with the 82750DB 

Digitizing is enabled by setting the Digitize Enable bit
in the Miscellaneous Control register. Note that
enabling the digitize mode does not automatically
enable genlocking. The Genlock bit must be set
separately, if it is required. When digitizing, the
82750DB is used to shift digitized data into the VRAM
shift registers, and then transfer this data into the
VRAM array.

The 82750DB also provides an external "digitizer
window" signal, FCO. This signal defines the vertical
active region in which the digitizer is enabled.
Typically, the
user sets up the display parameters to reflect the 
"window" of the display to be digitized. The horizon- 
tal and vertical active window size can be selected 
by programming the Active Start and Stop registers. 
FCO is derived from the Vertical Start and Stop reg- 
isters, and is used to enable the digitizer to drive the 
VRAM bus. During the programmed vertical blanking 
interval the FCO signal will be negated, and there- 
fore, the digitizer is prohibited from driving the VRAM 
bus. This will allow data to be read from the VRAM 
serial data bus during the automatic register transfer 
that is performed at the start of the field. Note that it 
will still be possible to program the 82750DB to digi- 
tize during the vertical blanking interval, in order, for 
example, to capture time codes from a VCR. 



When capturing and displaying NTSC data during 
the horizontal blanking interval of the first display 
line, a WRDIGINP command is sent on the VBUS to 
the 82750PB. (Refer to Figure 3-2.) Recall that there 
is a 5-line vertical pipeline delay through the 
82750DB. If the first display line is programmed to 
be n, the first display line will occur at n + 5. Similar- 
ly, if the last line is programmed to be m, then the 
last display will be line m + 5. The WRDIGINP 
VBUS code causes a dummy write transfer cycle 
that places the VRAMs in the write mode. The 
82750PB then sets the bitmap pointers to the first 
line's address (L0). This code is immediately fol- 
lowed by another WRDIGINP command that causes 
the 82750PB to perform a write transfer cycle at the 
L0 address. Since no digitized data has been read 
in, invalid data is loaded into row L0 of the VRAM 
array. 

During the active display of the first display line, the
82750DB provides shift clocks at the programmed
pixel rate. The digitized data is shifted into the
VRAMs while the user-programmed horizontal active
window is active. During the horizontal blanking
interval of the next line, the 82750DB sends a WRDIGI
code to the 82750PB, thereby transferring the L0
data from the shift register to the VRAM array at the
L0 address. The 82750PB performs a pitch
calculation, pointing it to the L1 row. After the
WRDIGI transfer has finished, the 82750DB issues a
WDIGINP command to the 82750PB that performs a
write transfer cycle at the L1 address. This will write
the L0 data into the L1 address. On the next line, the
L1 row will be written over with L1 data. This same
procedure continues for the entire active display,
until the last active line is reached (m + 5). A final
pair of WRDIGI and WRDIGINP codes is sent to the
82750PB to load in the last line of data. At the start
of horizontal sync of the next line, the FCO signal
will be negated.

Figure 3-2. Digitizing Example

The purpose of the WDIGINP may not be apparent 
at first glance. This signal ensures that the correct 
data is written into the last selected VRAM address. 
This is necessary when crossing the physical bound- 
aries of VRAM memory. 

When the 82750DB is genlocked, the digitizing 
device must also provide the HRESET# and 
VRESET# signals. The device must ensure that 
VRESET# is never asserted during the start of the 
line. This allows a register transfer (which shortens 
the active display and is required for digitizing) to 
complete before the start of a field register transfer. 



The vertical sync pulses are buffered, so the start of 
the field transfer request can be honored immediate- 
ly after the previous transfer request is finished. 

Also, captured NTSC data may be displayed on a 
VGA-type monitor. This requires the 82750DB to op- 
erate at a VGA frequency (approximately 31.5 kHz), 
which is twice that of NTSC. Each line of captured 
NTSC data is read into the 82750DB twice. Setting 
the line replicate bit makes doubling of memory un- 
necessary. Figure 3-3 illustrates how the 82750DB 
operates in such a mode. The Line Replicate, Digitiz- 
er, and Genlock bits in the Miscellaneous Control 
register are assumed to be set to one. During the
HBI of the first display line, a dummy write transfer
cycle (WRDIGINP) places the VRAMs in the write
mode. The 82750PB then sets the bitmap pointers
to the first line's address (L0). This code is
immediately followed by a WDIGINP command, causing
the 82750PB to perform a write transfer cycle at the
L0 address. Since no digitized data has been read in,
unknown values are loaded into row L0 of the VRAM
array.

Figure 3-3. Digitizing Example with Line Replicate



At the end of the first line the 82750DB sends two
WRDIGINP codes to the 82750PB, thereby
transferring the L0 data from the shift register to
the VRAM array at the L0 address. The 82750PB does
not perform a pitch calculation, so the pointer
remains at the address for L0. After the second
display line (which has the same data as the first
line), a WRDIGI code is sent to the 82750PB that
writes the L0 data to the L0 address and updates the
bitmap pointer to L1. The WRDIGINP signal
immediately following this selects the L1 address.
After the third line of data, two WRDIGINP codes that
select the L1 address are sent. After the fourth line
(which has the same data as the third line), a write
operation is performed to load L1 data into the L1
address, and the 82750PB pointer is updated to
address L2. A WRDIGINP code is sent to select the L2
address. This same procedure continues for the entire
active display, until the last active line is reached
(m + 5). A final pair of WRDIGI and WRDIGINP or two
WRDIGINP codes are sent to the 82750PB to load in
the last line of data. At the start of horizontal sync
of the next line, the FCO signal will be negated.


4.0 PROGRAMMING THE 82750DB



Overview 

All registers are loaded by the issuance of a REGX
command from the 82750DB to the 82750PB over
the VBUS. This causes the 82750PB to load a
sequence of register values into the VRAM serial
output registers from an address designated by an
82750DB register pointer. After the request is
granted, a new 82750DB register word is read in with
each SCLK. Each 32-bit word consists of a register
address in the high byte and register values in the
rest of the word. The sequence is terminated by a
stop code that corresponds to the address byte being
equal to 0xff. A variable number of 32-bit words
can be loaded. During reset, if a stop code is not
found within 256 T-cycles, the register transfer is
terminated, a SHUTDOWN code is asserted on the
VBUS, and the 82750DB returns to the reset state. All
transfer requests are terminated at the start of a new
field. This ensures that non-terminating register
transfers caused by bad register data will be halted.
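
The word layout described above (address in the high byte, value in the remaining 24 bits, stop code 0xff) can be illustrated with two hypothetical helpers. The sketch deliberately ignores the checksum word and the special handling of cursor bitmap data, and the register address 0x10 used in the usage note is an arbitrary example, not a documented register.

```python
STOP_ADDRESS = 0xFF  # address byte that terminates a register list


def register_word(address, value):
    """Build one 32-bit register word: address byte, then 24-bit value."""
    if not 0 <= address <= 0xFF:
        raise ValueError("address must fit in one byte")
    if not 0 <= value <= 0xFFFFFF:
        raise ValueError("value must fit in 24 bits")
    return (address << 24) | value


def parse_register_list(words):
    """Return (address, value) pairs up to, not including, the stop code."""
    registers = []
    for word in words:
        address = (word >> 24) & 0xFF
        if address == STOP_ADDRESS:
            return registers
        registers.append((address, word & 0xFFFFFF))
    raise ValueError("register list has no stop code")
```

For instance, `register_word(0x10, 0x123456)` produces 0x10123456, and a list that lacks a word with address 0xff is rejected, mirroring the non-terminating transfer case.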

During this register transfer, and on all subsequent 
register transfers (programmed or automatic), the 
82750DB performs a vertical checksum on the regis- 
ter data. The last 32-bit word read in during a regis- 
ter transfer is the user-generated checksum of that 
register data. If the 82750DB-generated checksum
does not match the user-generated checksum,
the 82750DB sends out a SHUTDOWN code to the 
82750PB via the VBUS, and will automatically re-en- 
ter the reset state. 



Pipeline Delay through the 82750DB 

The actual horizontal pipeline delay through the 
82750DB is dependent on processing elements 
used to generate the output. If Y interpolation is not 
used, the pipeline delay is: 

Horiz. Active Pipeline Delay = 16 cycles + 
SCLK Transfer Timing Delay 

Here the SCLK Transfer Timing Delay is 1 for 1X, 2 
for 1/2X, and 3 for 1/3X. 

If Y interpolation is used, the pipeline delay is: 

Horiz. Pipeline Delay = 16 cycles + 
SCLK Transfer Timing Delay + Integer (Pixel Time) 

The Integer (Pixel Time) is simply the integer value
of the programmed pixel time. The horizontal pipeline
delay for blanking differs from that of active.
Whether Y interpolation is on or off, the pipeline
delay for horizontal blanking is:

Horiz. Blanking Pipeline Delay = 10 cycles + 
SCLK Transfer Timing Delay 
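
The delay formulas above can be collected into a small calculator. The names below are hypothetical; the sketch simply restates the formulas in T-cycles and rejects a non-integral pixel time when Y interpolation is requested.

```python
SCLK_DELAY = {"1X": 1, "1/2X": 2, "1/3X": 3}  # SCLK Transfer Timing Delay


def horiz_active_delay(sclk_mode, pixel_time, y_interpolation):
    """Horizontal active pipeline delay in T-cycles."""
    delay = 16 + SCLK_DELAY[sclk_mode]
    if y_interpolation:
        if pixel_time != int(pixel_time):
            raise ValueError("Y interpolation needs an integral pixel time")
        delay += int(pixel_time)
    return delay


def horiz_blanking_delay(sclk_mode):
    """Horizontal blanking pipeline delay in T-cycles."""
    return 10 + SCLK_DELAY[sclk_mode]
```

With 1/3X transfers and a pixel time of 2, the active delay is 19 T-cycles without Y interpolation and 21 with it, while the blanking delay is 13 in both cases.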
The horizontal sync pipeline delay is always equal to 
cycles. 

Thus all horizontal parameters, (e.g. horizontal 
blanking start, active stop) must be programmed to 
account for the total horizontal pipeline delay. The 
vertical pipeline delay. The vertical blanking and 
vertical sync pipeline delay are always equal to 
lines. All vertical parameters must be programmed
so that this delay is taken into account.

PROGRAMMING CONSIDERATIONS

The user must ensure that the 82750DB is pro- 
grammed correctly. Illegal or illogical combinations 
of display parameters are not corrected in hardware, 
and may cause the 82750DB to output erroneous 
display or timing information. The following list high- 
lights some basic guidelines to follow when pro- 
gramming the 82750DB. 

1. The maximum rate that data may be read into the
82750DB is determined by the type of memory
used. This in turn affects the maximum rate and
depth of data that can be displayed. If 32 bits of 
data can only be read into the 82750DB every 
two clock cycles, only 16 bits of data may be dis- 
played every clock cycle. The programmer 
should match the transfer rate (1X, 1/2X, or 
1/3X) with the memory speed, and the display 
pixel rate with the pixel depth and memory band- 
width. 

2. Blanking intervals of the display are defined by 
the non-active programmed time. During this por- 
tion of the display, programmed transfers take 
place. If a transfer does not complete before the 
start of the active display, it is terminated, and 
active display data is shifted into the 82750DB at 
the programmed rate. During horizontal blanking 
intervals, the user should allow enough time for 
all programmed register, colormap, and VU data 
transfers to complete. 

3. When digitizing (capturing) images, no other bit- 
map transfers (e.g., REGX.VU) should be sched- 
uled to occur during the active portion of the field. 

4. Active start and stop times should not be
programmed to overlap the blanking stop and start
times, taking the pipeline delay through the
82750DB into account.

5. Programming the Y interpolation to occur in a 
non-integral pixel width will cause the Y channel 
to output incorrect data. 
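
Guideline 1 can be checked with simple arithmetic. The sketch below assumes a 32-bit word is delivered once every one, two, or three clock cycles for the 1X, 1/2X, and 1/3X rates respectively (the text gives only the two-cycle case explicitly); the function names are hypothetical.

```python
from fractions import Fraction

# Assumed number of clock cycles needed to read one 32-bit word at each rate.
TRANSFER_DIVISOR = {"1X": 1, "1/2X": 2, "1/3X": 3}


def max_bits_per_cycle(transfer_rate, bus_width=32):
    """Average number of bits that can be read per clock cycle."""
    return Fraction(bus_width, TRANSFER_DIVISOR[transfer_rate])


def display_fits(transfer_rate, bits_per_pixel, pixel_time):
    """True if the display's bit demand does not exceed the memory supply.

    pixel_time is the pixel duration in T-cycles (for example 1, 1.5, 2).
    """
    needed = Fraction(bits_per_pixel) / Fraction(pixel_time)
    return needed <= max_bits_per_cycle(transfer_rate)
```

With a 1/2X transfer rate, 16 bits/pixel at one T-cycle per pixel fits, while 32 bits/pixel at the same pixel time does not.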



CURSOR REGISTERS 

The following registers are used to program the 
characteristics of the on-chip cursor. 



Cursor Control Register 

0x5a

31      24 23                                      0
| 01011010 | Vertical Position | Horizontal Position |

- Horizontal Position in units of T-cycles
- Vertical Position in units of full lines

This register gives the horizontal and vertical
position of the cursor. The cursor will extend 16 pixel
periods, starting at the prescribed horizontal position,
for the next 16 lines (or 32 pixel periods for 32
lines if the 2X Cursor Mode bits in the General Control
register are set to one). Receipt of this address
also causes the 82750DB to interpret the next sixteen
32-bit words of register data as the 16 x 16 x 2-bit
cursor map: the register address decoding logic
internal to the 82750DB is disabled, and the next
16 words of information are loaded into the Cursor
table. Each 32-bit word is interpreted as a line
(16 pixels) of cursor data, with the two least
significant bits corresponding to the first cursor
pixel to be displayed.
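
As a rough illustration of the cursor map layout just described, the following sketch packs a 16 x 16 grid of 2-bit pixel values into sixteen 32-bit words, placing the first displayed pixel in the two least significant bits. The function names are hypothetical and the meaning of each 2-bit value is left to the caller.

```python
def pack_cursor_line(pixels):
    """Pack 16 two-bit cursor pixels into one 32-bit word.

    pixels[0] is the first pixel displayed and lands in bits 1:0.
    """
    if len(pixels) != 16:
        raise ValueError("a cursor line has exactly 16 pixels")
    word = 0
    for position, value in enumerate(pixels):
        if not 0 <= value <= 3:
            raise ValueError("each cursor pixel is a 2-bit value")
        word |= value << (2 * position)
    return word


def pack_cursor_map(rows):
    """Pack a 16 x 16 cursor into the sixteen words that follow the Cursor Control write."""
    if len(rows) != 16:
        raise ValueError("a cursor map has exactly 16 lines")
    return [pack_cursor_line(row) for row in rows]
```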



Cursor Color 3 

0x59

31      24 23                                      0
| 01011001 | Blue/U Color | Red/V Color | Green/Y Color |



If the cursor is enabled and the 24 bits of data in this 
register are selected, the data will be sent directly to 
the YUV conversion matrix during active display. The
bits should be programmed as RGB values when the 
YUV to RGB matrix is not being used. 



Cursor Color 2 

0x58

31      24 23                                      0
| 01011000 | Blue/U Color | Red/V Color | Green/Y Color |



If the cursor is enabled and the 24 bits of data in this 
register are selected, the data will be sent directly to 
the YUV conversion matrix during active display. The 
bits should be programmed as RGB values when the 
YUV to RGB matrix is not being used. 



Cursor Position Update Register

0x5b

31      24 23               12 11                  0
| 01011011 | Vertical Position | Horizontal Position |

- Horizontal Position in units of T-cycles
- Vertical Position in units of full lines

This register gives the horizontal and vertical
position of the cursor. The cursor will extend 16 pixel
periods, starting at the prescribed horizontal position,
for the next 16 lines. (Or 32 pixel periods for 32
lines if the 2X Cursor Mode bits in the General Control
register are set to one.)



Cursor Color 1

0x57

31      24 23          16 15          8 7           0
| 01010111 | Blue/U Color | Red/V Color | Green/Y Color |

If the cursor is enabled and the 24 bits of data in this
register are selected, the data will be sent directly to
the YUV conversion matrix during active display. The
bits should be programmed as RGB values when the
YUV to RGB matrix is not being used.
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DISPLAY TIMING REGISTERS 

Each register has two, 12-bit components, listed 
with least significant bits first, followed by the 12 
most significant bits. Horizontal timing is measured 
in units of T-cycles (periods of the master clock) 
from the start of horizontal sync. The register con- 
tent defines the number of T-cycles that elapse be- 
fore the event controlled by this register takes place. 
The exception to this rule is the base counter, which 
specifies the number of T-cycles/half line. Zero is 
not an allowable value; use the total number of T-cy- 
cles per half line or full line instead. Unused bits 
should be zero. Sync signals are RESET to initial 
values as specified for each; "start" means to set to 
1, and "stop" means to be reset to zero. 
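
The register diagrams in this section all place the register address in bits 31:24, a 12-bit vertical component in bits 23:12, and a 12-bit horizontal component in bits 11:0. A minimal sketch of assembling such a word follows; the helper name and the range checks are assumptions for illustration.

```python
def pack_timing_register(address, vertical, horizontal):
    """Build a 32-bit register word: address | vertical (12 bits) | horizontal (12 bits)."""
    if not 0 <= address <= 0xFF:
        raise ValueError("address must fit in 8 bits")
    if not 0 <= vertical <= 0xFFF:
        raise ValueError("vertical component must fit in 12 bits")
    if not 0 <= horizontal <= 0xFFF:
        raise ValueError("horizontal component must fit in 12 bits")
    return (address << 24) | (vertical << 12) | horizontal
```

For instance, a Sync Stops (0x55) word with VSYNC Stop at half line 6 and HSYNC Stop at T-cycle 64 would be `pack_timing_register(0x55, 6, 64)`; these particular values are made up for the example.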



Base Counter 

0x56

31      24 23                  12 11                      0
| 01010110 | # of Half Lines/Field | # of T-Cycles/Half Line |

- T-cycles/Half Line in units of T-cycles (periods of the master clock)
- Half Lines/Field in units of half lines

As defined by NTSC standards, vertical timing can 
be measured from the start of a field in one of two 
ways: either in units of half lines, or in units of full 
lines. When programmed for an interlaced display, 
(i.e. an odd number of half lines per field) the start of 
a field coincides with the start of a line on odd fields 
and with the midpoint of a line on even fields. In the 
latter case, for an event that is programmed in full 
lines, the first half line is ignored, and counting be- 
gins with the first full line. With this interpretation, the 
register content defines the number of half or full 
lines that elapse before the event controlled by this 
register takes place. The same may be said for the 
horizontal component, which is defined by the num- 
ber of T-cycles/half line. The hardware does not 
look for nor correct illogical combinations of register 
settings. The monitor should be protected from dam- 
age with external circuitry when debugging is in 
progress. 

All of the internal timing is derived from comparing 
the programmed values with the values of this regis- 
ter. The horizontal base counter is programmed us- 
ing the least significant 12 bits. In this case the val- 
ues loaded into this register should be one less than 
the desired value. Bits 23 through 12 are used to 
specify the number of half lines per field. 
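
The Base Counter rules above can be illustrated as follows. This sketch applies the "one less than desired" adjustment to the horizontal component only, since that is the only component the text mentions; whether the half-line count needs a similar adjustment is not stated, so it is written unchanged here. The function names and the sample numbers are hypothetical.

```python
BASE_COUNTER_ADDRESS = 0x56


def base_counter_word(t_cycles_per_half_line, half_lines_per_field):
    """Build the Base Counter register word."""
    horizontal = t_cycles_per_half_line - 1  # loaded value is one less than desired
    if not 0 < horizontal <= 0xFFF:
        raise ValueError("horizontal count out of range")
    if not 0 < half_lines_per_field <= 0xFFF:
        raise ValueError("half lines per field out of range")
    return (BASE_COUNTER_ADDRESS << 24) | (half_lines_per_field << 12) | horizontal


def is_interlaced(half_lines_per_field):
    """An odd number of half lines per field selects an interlaced display."""
    return half_lines_per_field % 2 == 1
```

A field of 525 half lines (262.5 lines) is odd and therefore interlaced; 524 half lines would give a non-interlaced display.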



Sync Stops 




0x55

31      24 23        12 11         0
| 01010101 | VSYNC Stop | HSYNC Stop |

- HSYNC Stop in units of T-cycles
- VSYNC Stop in units of half lines



Sync Starts 

0x54

31      24 23         12 11          0
| 01010100 | VSYNC Start | HSYNC Start |

- HSYNC Start in units of T-cycles
- VSYNC Start in units of half lines

The Sync Stops and Sync Starts registers are used 
in conjunction with one another to specify the start 
and stop locations of the horizontal sync, HSYNC, 
and vertical sync, VSYNC, output signals. VSYNC 
may be programmed to start and stop at any time 
during a given field as defined on a half-line interval. 
Bits 23 through 12 in the Sync Starts and Sync 
Stops registers are used to define the start and stop 
times for VSYNC, respectively. Similarly, HSYNC
may be programmed to start and stop at any line
position as defined in units of T-cycles. Bits 11
through 0 in the Sync Starts and Sync Stops registers
are used to define the start and stop positions
for HSYNC, respectively.

The horizontal component of the Sync Stops register
also affects the composite sync, or CSYNC, output. In
this case, the CSYNC output will be the same as the
HSYNC output, except during the vertical sync and 
equalization interval. In the latter case, the CSYNC 
output is determined by the Serration and Equaliza- 
tion registers. 




Blanking Stops 



0x53

31      24 23                12 11                    0
| 01010011 | Vertical Blank Stop | Horizontal Blank Stop |

- HB Stop in units of T-cycles
- VB Stop in units of half lines

The Blanking Start and Stop registers control the 
composite blanking output (CB). The horizontal 
blanking start and stop position, in units of T-cycles, 
can be specified to occur at any time during the line. 
By the same token, the vertical blanking start and 
stop positions can be programmed to occur at any 
half-line interval. 
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The CB output combines both the horizontal and 
vertical blanking pulses programmed using these 
two registers. This information is independent from 
the HSYNC, VSYNC, and CSYNC outputs, so the 
user must specify the proper blanking intervals for 
the monitor that is being used. If the programmer 
specifies the blanking period to end before the ac- 
tive line starts, or start after the active line has end- 
ed, the border color is output. Due to internal pipe- 
line delays on the 82750DB, the values should be 
one less than desired for VB Start and Stop. For HB 
Start and Stop subtract the total horizontal pipeline 
delay. 
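
The adjustment described above can be sketched as a small helper. It takes the total horizontal pipeline delay as an input rather than computing it, and it does not attempt to wrap values that would fall before the start of the line; both choices are assumptions of this example, as is the function name.

```python
def blanking_register_values(hb_desired, vb_desired, horizontal_pipeline_delay):
    """Return (vertical, horizontal) values to program for a blanking start or stop.

    VB is programmed one less than desired; HB has the total horizontal
    pipeline delay subtracted.
    """
    vb = vb_desired - 1
    hb = hb_desired - horizontal_pipeline_delay
    if vb < 0 or hb < 0:
        raise ValueError("adjusted value is negative; wrap-around is not handled here")
    return vb, hb
```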



Blanking Starts

0x52

31      24 23                 12 11                     0
| 01010010 | Vertical Blank Start | Horizontal Blank Start |

- HB Start in units of T-cycles. Resets to 1.
- VB Start in units of half lines. Resets to 1.

Program values one less than desired for VB Start
and Stop. For horizontal blanking start, load the
desired value less the total horizontal pipeline delay.



Serration Start

0x51

31      24 23      12 11             0
| 01010001 | Not Used | Serration Start |

- SER Start in units of T-cycles. Resets to 0.
- (not used)

The vertical component of the CSYNC (composite
sync) signal is made up of two types of pulses:
equalization and serration pulses. The window during
which the serration pulses are active is determined
by the VSYNC start and stop positions, as
shown in Figure 4-1. When vertical sync (VSYNC) is
active, in this case on line 3, the first serration pulse
is output on the CSYNC signal. This pulse will start
at the T-cycle count specified in bits 11 to 0 of the
Serration Start register. The pulse will end when the
half-line count specified in the Base Counter register
has been reached. This pulse will be repeated for
every half line that the VSYNC output is programmed
to be active, regardless of the position in
the field. In Figure 4-1, this continues until half line
12, or line 6.
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Figure 4-1. Programming the Video Sync Outputs 
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Equalization Parameters 

0x50

31      24 23                       12 11                           0
| 01010000 | Vertical Equalization Stop | Horizontal Equalization Stop |

- EQH Stop in units of T-cycles. Resets to 1.
- EQV Stop in units of half lines. Resets to 1.



During the vertical equalizing period, which starts at
field-beginning, an equalization pulse is output on
the CSYNC signal at the beginning of each half line,
as shown in Figure 4-1. The width of this equalization
pulse is determined by the value in bits 11 to 0
of this register. The half line on which these pulses
are to stop is programmed in bits 23 through 12 of
this register. If VSYNC is programmed to occur during
the equalization interval (as it is for NTSC type
displays), the serration pulses are output on the
CSYNC signal.



Active Region Stops 



0x4f

31      24 23                 12 11                     0
| 01001111 | Vertical Active Stop | Horizontal Active Stop |

- Actdis Stop in units of T-cycles
- Vertical Stop in units of full lines

The active region window, during which pixels to be 
displayed are fetched from VRAM, is defined by the 
Active Region Start and Stop registers. The first dis- 
play line is actually five lines after the line indicated 
in the vertical region of the Active Region Start regis- 
ter. The position of the active region on a horizontal 
line is determined by the horizontal component of 
the Active Region Start register. Pixels will be 
fetched from VRAM at a rate determined by the 
number of bits/pixel and pixel widths. In order for the 
82750DB to operate properly, the horizontal width of 
the active region window must be an integral number 
of display pixel widths, taking into account the hori- 
zontal pipeline delay. Also, the Active Region Start 
and Stop must fall within a single line boundary, as 
dictated by the Base Counter register. The time at which
the first pixel actually appears at the output of the
82750DB is a function of the processing elements
used, as discussed above.

When the active region is over, the border color is 
output until the programmed blanking time is 
reached. Both the border and blanking information are
output at the transfer rate programmed by the user.
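
The requirement that the active window span an integral number of display pixel widths can be checked before programming the registers. The sketch below assumes the start and stop values passed in have already been adjusted for the horizontal pipeline delay, since the text does not spell out how the delay enters the calculation; the function name is hypothetical.

```python
from fractions import Fraction


def active_width_is_integral(h_start, h_stop, pixel_time):
    """True if (h_stop - h_start) T-cycles is a whole number of pixels.

    pixel_time is the pixel duration in T-cycles and may be fractional,
    for example "2.5" or 1.5.
    """
    width = h_stop - h_start
    if width <= 0:
        return False
    return (Fraction(width) / Fraction(pixel_time)).denominator == 1
```

A 640 T-cycle window with 2.5 T-cycle pixels holds exactly 256 pixels and passes; a 641 T-cycle window does not.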



Active Region Starts 

0x4e

31      24 23                  12 11                      0
| 01001110 | Vertical Active Start | Horizontal Active Start |

- Actdis Start in units of T-cycles
- Vertical Start in units of full lines



Burst Gate Stop 

0x4d

31      24 23                                    0
| 01001101 | Vertical BG Stop | Horizontal BG Stop |

- Horizontal Stop Position in units of T-cycles
- Vertical Stop Position in units of full lines

The Burst Gate Horizontal and Vertical Start and 
Stop registers allow the user to program a window 
into which burst can be added. This is useful when 
modulating the outputs of the 82750DB. 




Burst Gate Start 

0x4c

31      24 23                                      0
| 01001100 | Vertical BG Start | Horizontal BG Start |

- Horizontal Start Position in units of T-cycles
- Vertical Start Position in units of full lines



VBUS CODE REGISTERS 

The following group of registers is used by the
programmer to schedule when VBUS transfer or control
codes are to be sent to the 82750PB by the
82750DB.



Display Format Load Interrupt 



0x4b

| 01001011 | Vertical DFL Position | Horizontal DFL Position |

- Horizontal Position in units of T-cycles
- Vertical Position in units of full lines

This is the programmable XY interrupt, used by the
82750PB to perform a load of the Shadow Copy registers.
This interrupt is sent on the VBUS when
bits 23 to 12 match the current display line position,
and bits 11 to 0 match the T-cycle count.
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Line Notification Timing 



0x4a

| 01001010 | Not Used | Horizontal HLIN Position |

- HLIN timing in units of T-cycles
- Not Used

This indicates the position on each line to send a
HLINE code on the VBUS. The 82750PB requires
this information to keep track of the current display
line when drawing graphics.



Alpha Register

0x47

31      24 23         16 15             8 7               0
| 01000111 | Border Alpha | Alpha1 Register | Alpha0 Register |

The least significant 8 bits are for the ALPHA0 register
and are used during blanking and if the alpha trap
value is not matched. The next 8 bits are for the
ALPHA1 register when the alpha trap value is
matched. The most significant 8 bits provide the
alpha channel value during the border time.



Refresh and Register Transfer 

0x49

31      24 23              12 11                          0
| 01001001 | REGX Line Number | Refresh Horizontal Position |

- REFRESH horizontal timing in units of T-cycles
- Register Transfer Line number in units of full lines

When the T-cycle count matches the value programmed
into bits 11 to 0 of this register, a refresh
code is sent to the 82750PB. Since these codes tie
up the 82750PB for at least eight 82750PB cycles, 
the programmer must ensure that no transfer re- 
quests are scheduled to occur during this time. 

The line number for the next register transfer is 
specified in bits 23 to 12 of this register. If pro- 
grammed to occur, REGX will always be the first 
transfer request sent to the 82750PB, immediately 
after the end of active display. 

COLOR REGISTERS 

The following registers specify the state of DBU, 
DRV, DGY, and ALPHA signals during the field. 



Border Color 

0x48

31      24 23                                      0
| 01001000 | Blue/U Color | Red/V Color | Green/Y Color |



The 24 bits of data in this register are sent directly to 
the YUV conversion matrix during border time. Bor- 
der time is defined as the region in which neither 
active display nor blanking is programmed to occur. 
The bits should be programmed as RGB values 
when the YUV to RGB matrix is not being used. 



Blanking Color 

0x46

31      24 23          16 15                       0
| 01000110 | Blue/U Color | Red/V Color | Green/Y Color |



The 24 bits of data in this register are sent directly 
through the YUV conversion matrix during the pro- 
grammed blanking time. 

CONTROL REGISTERS 

The following registers are used to define the oper- 
ating modes of the 82750DB. 



Pixel Control 



0x45

Bits     Field
23       Pseudo 16-Bit Mode
22       VU Pixel Replicate
21:19    Bits/Pixel
18:14    Pixel Time
13:11    VU Sample Select
10       4X VU Expand
9        VU Interlace Enable
8        Conditional Interpolation Enable
7        VU Interpolation Round
6:0      SCLK Delay
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Bits 6:0— SCLK Delay 

The number "m" of T-cycles from initiation of a 
transfer request on the VBUS until the first SCLK is 
asserted by the 82750DB. 






Bit 7— VU Interpolation Round 

When equal to 0, this bit means truncate during in- 
terpolation. When set to one, this bit means round to 
odd during interpolation. 



Bit 8 — Conditional Interpolation Enable 

When reset to zero, this bit means all values of Y
and U are a full 8 bits of precision. When set to one,
this bit means the least significant bit of the Y sample
or the U sample controls the switching between VU
interpolation and graphics mode.



Bit 9— VU Interlace Enable 

Setting this bit to a one causes the interpolator to 
output different data on the odd and even fields. 
During the odd field, the odd lines of the interpola- 
tion sequence will be output. During the even field, 
the even lines of the interpolation sequence will be
output. Full lines of the programmed number of sam- 
ples of both the V and U data will be read in during 
each VU transfer. Setting this bit to a zero will cause 
horizontally and vertically interpolated data to be 
output on both fields. Only a full line of either V or U 
samples will be read in during each transfer request 
in this mode. 



Bit 10—4X VU Expand 

When this bit is set to a zero, a 2X expansion in both 
directions is performed. By setting this bit to a one, a
4X expansion is performed.



Bits 13:11 — VU Sample Select 

Table 4-1 provides the code and number of V and U
samples for bits 13:11.

Table 4-1. VU Sampling

Code     Number of V and U Samples
000      0 Samples for Each of V and U
111      32 Samples for Each of V and U
110      64 Samples for Each of V and U
101      96 Samples for Each of V and U
100      128 Samples for Each of V and U
011      160 Samples for Each of V and U
010      192 Samples for Each of V and U
001      256 Samples for Each of V and U



Bits 18:14— Pixel Time

Table 4-2 lists the codes and pixel duration for bits
18:14.

Table 4-2. Pixel Times

Code      Duration of Pixel
00001     1.0 T-cycle
00010     1.5 T-cycles
00100     2.0 T-cycles
01000     2.5 T-cycles
10000     3.0 T-cycles
10001     3.5 T-cycles
10010     4.0 T-cycles
10100     4.5 T-cycles
11000     5.0 T-cycles
11001     5.5 T-cycles
11010     6.0 T-cycles
11100     6.5 T-cycles
11101     7.0 T-cycles
11110     7.5 T-cycles
00011     8.0 T-cycles
00101     8.5 T-cycles
00110     9.0 T-cycles
00111     9.5 T-cycles
01001     10.0 T-cycles
01010     10.5 T-cycles
01011     11.0 T-cycles
01100     11.5 T-cycles
01101     12.0 T-cycles
01110     13.0 T-cycles
01111     14.0 T-cycles
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Bits 21:19— Bits/Pixel 

Table 4-3 provides the code and number of bits/pix- 
el for bits 21:19. 

Table 4-3. Number of Bits/Pixel 



Code     Number of Bits/Pixel
001      8
010      16
100      32



Bit 22— VU Pixel Replicate 

When set to one, each pixel generated by the VU 
Interpolator is held for 2-pixel times. This allows an 
effective 8X expansion of VU data. This is useful for 
high resolution applications where the blanking time 
is not sufficient to support higher VU sample loads. 



Bit 23— Pseudo 16-Bit Mode 

When set to one and 16 bits per pixel is chosen (bits 
21:19), the 82750DB is in the 16-bit with Alpha 
mode. Setting this signal to zero while in the 16-bit/ 
pixel mode puts the 82750DB into the 16-bit (655) 
mode. This bit represents a "don't care" input for all 
other values of bit/pixel. 
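
To show how the Pixel Control fields fit together, the sketch below assembles a register word from the bit positions and code tables given in this section. The function name, keyword arguments, and the use of strings as pixel-time keys are assumptions for illustration; only a subset of the Pixel Times codes is included to keep the example short, and the 000 VU Sample Select code is omitted.

```python
PIXEL_CONTROL_ADDRESS = 0x45

# Bits 13:11, keyed by number of samples for each of V and U (Table 4-1).
VU_SAMPLE_CODE = {32: 0b111, 64: 0b110, 96: 0b101, 128: 0b100,
                  160: 0b011, 192: 0b010, 256: 0b001}

# Bits 18:14, keyed by pixel duration in T-cycles (subset of Table 4-2).
PIXEL_TIME_CODE = {"1.0": 0b00001, "1.5": 0b00010, "2.0": 0b00100,
                   "2.5": 0b01000, "3.0": 0b10000, "4.0": 0b10010}

# Bits 21:19 (Table 4-3).
BITS_PER_PIXEL_CODE = {8: 0b001, 16: 0b010, 32: 0b100}


def pack_pixel_control(sclk_delay, pixel_time, bits_per_pixel, vu_samples, *,
                       round_vu=False, conditional_interp=False,
                       vu_interlace=False, expand_4x=False,
                       pixel_replicate=False, pseudo_16=False):
    """Assemble the Pixel Control register word."""
    if not 0 <= sclk_delay <= 0x7F:
        raise ValueError("SCLK Delay is a 7-bit field")
    word = PIXEL_CONTROL_ADDRESS << 24
    word |= sclk_delay                              # bits 6:0
    word |= int(round_vu) << 7                      # bit 7
    word |= int(conditional_interp) << 8            # bit 8
    word |= int(vu_interlace) << 9                  # bit 9
    word |= int(expand_4x) << 10                    # bit 10
    word |= VU_SAMPLE_CODE[vu_samples] << 11        # bits 13:11
    word |= PIXEL_TIME_CODE[pixel_time] << 14       # bits 18:14
    word |= BITS_PER_PIXEL_CODE[bits_per_pixel] << 19  # bits 21:19
    word |= int(pixel_replicate) << 22              # bit 22
    word |= int(pseudo_16) << 23                    # bit 23
    return word
```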



General Control

0x44

Bits     Field
23:17    Reserved - Set To Zero
16:13    Vblen
12       Viden
11       Gren
10       Sync Test
9:8      Channel Test Select
7        2X Vertical Cursor
6        2X Horizontal Cursor
5        Cursor Enable
4:0      Burst Multiple



Bits 4:0— Burst Multiple

These bits are used to program a divisor of the
FREQIN clock input in order to recover the
3.58 MHz NTSC color subcarrier. The programmed
value is the two's complement of the desired divisor.
The allowed range of values is 00001 through 11111,
which corresponds to divisions of 31 through 1. Note
that the 82750DB must be operating at an integer
multiple of 3.58 MHz for this to work effectively.



Bit 5— Cursor Enable

When set to one, the hardware cursor will output the
cursor data at prescribed intervals if programmed to
do so.



Bit 6— 2X Horizontal Cursor

When this bit is set to one, and the Cursor Enable bit
is set to one, every pixel on each line of the cursor
will be replicated once. Thus a cursor that was
16 x 16 pixels will become 32 x 16 pixels.



Bit 7— 2X Vertical Cursor

When this bit is set to one, and the Cursor Enable bit
is set to one, each line of the cursor will be replicated
once. Thus a cursor that was 16 x 16 pixels will
become a 16 x 32-pixel cursor.



Bit 9:8— Channel Select

These two bits control which output channel is
muxed onto the alpha digital outputs. It allows Y, U,
or V data to be available at the alpha channel. The
coding is provided in Table 4-4.

Table 4-4. Test Mode Select Coding

Code     Alpha Channel Output
00       Alpha Channel
01       Y Channel
10       V Channel
11       U Channel



Bit 10— Sync Test

This bit must be set to zero for proper operation.



Bit 11— Gren

This is the Graphics Enable bit for the Y Interpolator.
When this bit is set to one and the pixel is a graphics
pixel (switch is zero), a 2X interpolation will be
performed on the pixel.



Bit 12— Viden

This is the Video Enable bit of the Y Interpolator.
When this bit is set to one and the pixel is a video
pixel (switch is one), a 2X interpolation will be
performed on the pixel.



Bit 16:13— Vblen

These bits program the T-cycle length of each VBUS
code. The VBUS code length will be one T-cycle
longer than the programmed value. These bits must
have a minimum value of 2, and a maximum value of
15.
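
Two of the General Control fields involve a small encoding step: Burst Multiple is programmed as the 5-bit two's complement of the desired divisor, and Vblen is programmed as one less than the desired VBUS code length. A minimal sketch of both follows; the function names are hypothetical.

```python
def burst_multiple_code(divisor):
    """5-bit two's complement of the FREQIN divisor (divisors 1 through 31)."""
    if not 1 <= divisor <= 31:
        raise ValueError("divisor must be in the range 1..31")
    return (-divisor) & 0x1F


def vblen_code(code_length_t_cycles):
    """Vblen value for a VBUS code of the given length in T-cycles."""
    value = code_length_t_cycles - 1  # code length is one longer than the programmed value
    if not 2 <= value <= 15:
        raise ValueError("programmed Vblen must be in the range 2..15")
    return value
```

A divisor of 31 encodes as 00001 and a divisor of 1 encodes as 11111, matching the range given for Burst Multiple. A divisor of 8, for example, encodes as 11000.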
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Miscellaneous Control 



0x43

Bits     Field
23       Reserved (write as zero)
22       Line Replicate Enable
21       82750DB Mode Enable
20:19    Transfer Timing Select
18       Video Pass
17       Graphics Pass
16       Split CLUT
15       Bypass Conversion Matrix
14       Genlock Enable
13       Switch on LSB of Y
12       Alpha Enable
11       VU Interpolator Output Enable
10       Digitize Enable
9        Border Alpha Enable
8        Alpha Trap Select
7:0      Alpha Trap Value



Bits 7:0— Alpha Trap

Bits 7:0 are 8-bit values used for comparison with
the current pixel's Y value, to select one of two
programmable alpha values.



Bit 8— Alpha Trap Select

A value of one enables the Y value of the current
pixel to be compared with the value in the Alpha
Trap register. If the two values match and Alpha has
been enabled via the Alpha Enable bit, the contents
of the ALPHA1 register are output on ALPHA[7:0]. If
the two values don't match and Alpha Enable has
been set to one, the content of the ALPHA0 register
is output. When Alpha Trap Select is set to a zero in
the pseudo 16- or 32-bit mode, the most significant
byte of the pixel word is output. When Alpha Trap
Select is set to zero in all other modes, the value of
the ALPHA0 register is output.



Bit 9— Border Alpha Enable

A value of one enables the eight most significant bits
in the ALPHA register to be output. When set to a
zero, the ALPHA0 register is output during border
time.



Bit 10— Digitize Enable

When this bit is set to a one, the FCO signal will be
set to a one, and the transfer codes for bitmaps will
indicate that write operations should occur.



Bit 11— VU Interpolator Output Enable

This bit enables VU interpolation data to be displayed.
When set to a zero, all pixels are treated as
graphic pixels.



Bit 12— Alpha Enable

When set to one, the alpha output is governed by
the alpha trap value, as described above. When reset
to zero, the contents of the ALPHA0 register are
the alpha output in the 8- and 16-bit modes, and the
explicit ALPHA data is output in the pseudo 16- and
32-bit modes.



Bit 13— Switch on LS Bit of Y 

When set to one, the least significant bit of Y is used 
as a Video/Graphics switch in all modes. When re- 
set to zero, the least significant bit of U from the 
interpolator acts as a switch. 



Bit 14— Genlock Enable 

This bit enables the genlock mode of the 82750DB. 
In this mode, receipt of the external HRESET# sig- 
nal during the second half of a scan line will cause 
the termination of that scan line. Similarly, receipt of 
the externally produced VRESET# signal will termi- 
nate the field. In both cases, terminate denotes that 
the proper on-chip signals are produced to signify 
end of the line and end of the field. 



Bit 15— Bypass Conversion Matrix 

When this bit is set to a one the YUV to RGB matrix 
will be bypassed, and the Y, U, and V data will feed 
directly into the Digital to Analog Converters. 



Bit 16— Split CLUT 

This bit divides the CLUT into an odd and an even 
half, depending on the polarity of the Video/Graph- 
ics switch. This switch is selectable and may be ei- 
ther the LSB of U from the interpolator or Y from the 
pixel word. The LSB of the CLUT address is set to 
one (odd address) if the Video/Graphics switch is 
one; the LSB of the CLUT address is set to zero 
(even address) if the Video/Graphics switch is zero. 

Bit 17— Graphics Pass 

Setting this bit to a one bypasses the CLUT for 
graphics pixels, even in non-mixed modes. 

Bit 18— Video Pass 

When set to a one, all video pixels (luminance val- 
ues associated with sub-sampled UV values) will by- 
pass the color table. For mixed modes, this corre- 
sponds to the switch flag having a value of one. 
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Bit 20:19— Transfer Timing Select 

These bits are two-bit codes that select one of three 
possible transfer shift clock rates. This allows the 
operating speed of the 82750DB to be tailored to the 
external memory access time. After RESET, the 
transfer rate is set to the slowest possible clock rate 
(1/3X). The programmed rate is used during all non- 
active display times for transferring data from 
VRAMs. It also defines the rate that the border and 
blanking data is output. During active display, the 
data is read as needed from VRAM using the pro- 
grammed timing. The coding of these bits is listed in 
Table 4-5. 

Table 4-5. Coding of Transfer Timing Select Bits 



Bit 20     Bit 19     Result
0          0          1/3X Transfer (Default)
0          1          1/2X Transfer
1          0          1X Transfer



Bit 21— 82750DB Enable 

When set to zero, the 82750DB will be the register 
equivalent of a 82750DA. When set to a one all the 
features of the 82750DB will be enabled. 



Bit 22— Line Replicate Enable 

When this bit is set to one, every line in the active
display is generated twice. Each new bitmap transfer
occurs at half the line rate, with a new VBUS code
being used to indicate that a transfer is to take place
without the pitch calculation. The VU Interpolator will
also duplicate the lines it generates, yielding more
time between transfer cycles. This mode is useful for
obtaining a 2X increase in vertical resolution without
the need for increasing the VRAM transfer bandwidth.



COLOR MAP REGISTERS 

The following registers are used to access and con- 
trol the three 256 x 8-bit Color Lookup Tables. 



Mask Data Registers 



0x42

| 01000010 | Blue/U Mask Data | Red/V Mask Data | Green/Y Mask Data |



Each of the three 8-bit registers contains the bit pat- 
tern used when the corresponding bit in the Mask 
Set register is asserted. 



Mask Set Registers 

0x41

31      24 23          16 15                       0
| 01000001 | Blue/U Color | Red/V Color | Green/Y Color |



This is a 24-bit register that contains the mask bit
pattern for the RGB/YUV color map addresses. 
When a bit in this register is asserted, the corre- 
sponding bit in the address is set to the value de- 
fined in the Mask Data registers. 
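
The interaction between the Mask Set and Mask Data registers can be expressed as a bitwise operation on each 8-bit color map address. The sketch below shows it for one channel; the function name is hypothetical, and it assumes "asserted" means the Mask Set bit is one.

```python
def apply_clut_mask(address, mask_set, mask_data):
    """Return the 8-bit CLUT address after masking.

    Where a Mask Set bit is 1, the address bit is replaced by the
    corresponding Mask Data bit; elsewhere the address bit passes through.
    """
    return ((address & ~mask_set) | (mask_data & mask_set)) & 0xFF
```

For example, with Mask Set 00001111 and Mask Data 00000101, address 10101010 becomes 10100101: the upper nibble is unchanged and the lower nibble is forced to 0101.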



CLUT Index Register

0x40

| 01000000 | Not Used | Not Used | YUV CLUT Index |



The CLUT Index register is an 8-bit register used for 
loading the color tables. This register maps the user- 
specified 6-bit color map address into an 8-bit ad- 
dress. A logical OR operation is performed between 
the 6-bit address and the 8-bit index word to obtain 
the new CLUT address. 



Color Lookup Table Addresses 



0x00-0x3f 



If the 82750DB Enable mode bit in the Miscellaneous
Control register is set to zero, the CLUT addresses
are decoded to appear as addresses to the
reduced-size 82750DA color table. The least significant
four bits of the address are used for the Y color
table address, and the upper nibble is used to ad- 
dress the V and U color table simultaneously. This is 
a compatibility mode for the 82750DA, which has a 
reduced-size color table. 



31        28 27       24 23   16 15    8 7     0
| UV Address | Y Address | UData | VData | YData |



If the 82750DB Enable mode bit is set to one, the full 
color table is used. In this case, the most significant 
byte of the 32-bit data word is used as an address to 
the color table. The address is ORed with the most 
recently loaded CLUT Index register. 



31 30 29        24 23   16 15    8 7     0
|    | YUV Address | UData | VData | YData |
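
A color table load in the full (82750DB) mode can be sketched as follows, using the word layout shown above and the OR with the CLUT Index register described earlier. The function names are hypothetical, and the address is limited to the 6-bit range 0x00-0x3f used for Color Lookup Table Addresses.

```python
def clut_load_word(address, u, v, y):
    """Build a 32-bit color table load word: address | U data | V data | Y data."""
    if not 0 <= address <= 0x3F:
        raise ValueError("CLUT load addresses are 0x00 through 0x3f")
    for component in (u, v, y):
        if not 0 <= component <= 0xFF:
            raise ValueError("U, V, and Y data are 8-bit values")
    return (address << 24) | (u << 16) | (v << 8) | y


def effective_clut_address(address, clut_index):
    """8-bit table address: the 6-bit address ORed with the CLUT Index register."""
    return (address | clut_index) & 0xFF
```

With the CLUT Index register holding 0xC0, a load to address 0x05 reaches table entry 0xC5, so the index selects which 64-entry block of the 256-entry table is written.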
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82750DB Register Summary 

The following table illustrates the register space of the 82750DB. 

Table 4-6. 82750DB Register Space 



Address        82750DB Register

0x00-0x0f      CLUT Locations 0-15
0x10-0x30      CLUT Locations 16-48
0x31           CLUT Location 49
0x32           CLUT Location 50
0x33           CLUT Location 51
0x34           CLUT Location 52
0x35-0x37      CLUT Locations 53-55
0x38           CLUT Location 56
0x39-0x3f      CLUT Locations 57-63
0x40           CLUT Index Register
0x41           CLUT Mask Set Register
0x42           CLUT Mask Data Register
0x43           Miscellaneous Control
0x44           General Control
0x45           Pixel Control
0x46           Blanking Color
0x47           Alpha Register
0x48           Border Color
0x49           Register Transfer
0x4a           Line Notification and Timing
0x4b           DFL Load
0x4c           Burst Gate Start
0x4d           Burst Gate Stop
0x4e           Active Region Start
0x4f           Active Region Stop
0x50           Equalization Parameters
0x51           Serration Start
0x52           Blanking Start
0x53           Blanking Stop
0x54           Sync Start
0x55           Sync Stop
0x56           Base Counters
0x57           Cursor Color 1
0x58           Cursor Color 2
0x59           Cursor Color 3
0x5a           Cursor Control
0x5b           Not Used
0x5c           Not Used
0x5d           Not Used
0x5e           Not Used
0x5f           Not Used
0x60           Not Used
0x61           Not Used
0x62           Not Used
0x63           Not Used
0x64           Not Used
0x65           Not Used
0x66           Not Used
0x67           Not Used
0x68           Not Used
0x69-0x6e      Not Used
0x6f           Not Used
0x70           Not Used
0x71-0x7f      Not Used
0x80-0xfe      Not Used
0xff           Stop Code
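
The register space in Table 4-6 can be captured as a small lookup, which is handy when decoding register addresses in a debugger or trace. The sketch below is a hypothetical helper (the names `NAMED_REGISTERS` and `register_name` are not part of any Intel tool), and it simply transcribes the table.

```python
# Named registers transcribed from Table 4-6 (addresses 0x40-0x5a).
NAMED_REGISTERS = {
    0x40: "CLUT Index Register", 0x41: "CLUT Mask Set Register",
    0x42: "CLUT Mask Data Register", 0x43: "Miscellaneous Control",
    0x44: "General Control", 0x45: "Pixel Control",
    0x46: "Blanking Color", 0x47: "Alpha Register",
    0x48: "Border Color", 0x49: "Register Transfer",
    0x4A: "Line Notification and Timing", 0x4B: "DFL Load",
    0x4C: "Burst Gate Start", 0x4D: "Burst Gate Stop",
    0x4E: "Active Region Start", 0x4F: "Active Region Stop",
    0x50: "Equalization Parameters", 0x51: "Serration Start",
    0x52: "Blanking Start", 0x53: "Blanking Stop",
    0x54: "Sync Start", 0x55: "Sync Stop", 0x56: "Base Counters",
    0x57: "Cursor Color 1", 0x58: "Cursor Color 2",
    0x59: "Cursor Color 3", 0x5A: "Cursor Control",
}


def register_name(addr):
    """Return the Table 4-6 name for an 8-bit 82750DB register address."""
    if not 0 <= addr <= 0xFF:
        raise ValueError("register address must fit in 8 bits")
    if addr <= 0x3F:
        return f"CLUT Location {addr}"      # 0x00-0x3f: CLUT locations 0-63
    if addr == 0xFF:
        return "Stop Code"
    return NAMED_REGISTERS.get(addr, "Not Used")  # 0x5b-0xfe are unused
```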
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5.0 ELECTRICAL DATA 
Maximum Ratings 

Table 5-1 is a stress rating only, and functional operation 
at the maximums is not guaranteed. Functional operating 
conditions are given in the DC and AC Characteristics 
(Tables 5-2, 5-3, 5-4, and 5-5). 



Exposure to the Maximum Ratings may affect device 
reliability. Furthermore, although the 82750DB con- 
tains protective circuitry to resist damage from static 
electrical discharge, always take precautions to 
avoid high static voltages or electric fields. 



Table 5-1. Absolute Maximum Requirements 



Condition                                    Maximum Requirement

Case Temperature under Bias                  -65°C to 110°C
Storage Temperature                          -65°C to 110°C
Voltage on Any Pin with Respect to Ground    -0.5V to VCC + 0.5V
Supply Voltage with Respect to VSS           -0.5V to +6.5V



DC Characteristics 



Table 5-2. DC Characteristics  VCC = 5V ±10%, TCASE = 0°C to 95°C

Symbol    Parameter                   Min     Typ           Max          Unit  Notes

VIL       Input LOW Voltage           -0.3                  0.8          V
VIH       Input HIGH Voltage          2.0                   VCC + 0.3    V
VOL       Output LOW Voltage                  [illegible]   0.4          V     IOL = 4.0 mA (1)
VOH       Output HIGH Voltage         2.4     3.0                        V     IOH = -1.0 mA (1)
IIL       Input Leakage Current       -10                   +10          µA    VSS < VIN < VCC
IOZ       Output Leakage Current      -10                   +10          µA    VSS < VIN < VCC
ICCT      Power Supply Current                185           250          mA    28 MHz (2)
ICCNT     Power Supply Current                140           190          mA    28 MHz (3)
ICCT      Power Supply Current                280           375          mA    45 MHz (2)
ICCNT     Power Supply Current                215           285          mA    45 MHz (3)
CIN       Input Capacitance                                 10.0         pF    FC = 1 MHz (4)
COUT      Output Capacitance                                12.0         pF    FC = 1 MHz (4)
CFREQIN   FREQIN Input Capacitance                          20.0         pF    FC = 1 MHz (4)

NOTES:

1. Measured with FREQIN = 7 MHz.

2. Typical current value measured under typical conditions with the Digital Outputs (DGY, DRV, and DBU) toggling. Maximum
current value guaranteed with 50 pF maximum output loading. Analog Outputs disabled.

3. Typical current value measured under typical conditions with the Digital Outputs (DGY, DRV, and DBU) not toggling.
Maximum current value guaranteed with 50 pF maximum output loading. Analog Supply Current IACC not included.

4. Not 100% tested.
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AC Characteristics 

Table 5-3. AC Characteristics at 28 MHz 



VCC = 5V ±10%, TCASE = 0°C to 95°C, CL = 50 pF

Symbol  Parameter                          Min           Max           Unit  Figure         Notes

        Frequency                          7             28            MHz                  1X Clock
t1      FREQIN Period                      35            140           ns    5-1
t2      FREQIN High Time                   12            23            ns    5-1            (Note 1)
t3      FREQIN Low Time                    12            23            ns    5-1            (Note 1)
t4      FREQIN Fall Time                                 4             ns    5-1
t5      FREQIN Rise Time                                 4             ns    5-1
t6a     HSYNC, VSYNC, CSYNC, BG,                         24            ns    5-2
        FCO Valid Delay
t6b     VBUS[3:0] Valid Delay                            26            ns    5-2
t7      RESETB#, VRESET#, HRESET#,                                     ns    5-3
        DISDIG, TESTACT Setup
t8      RESETB#, VRESET#, HRESET#,         13                          ns    5-3
        DISDIG, TESTACT Hold
t9      SCLK[1:0] Valid Delay High                       14            ns    5-4            1X Mode
t10     SCLK[1:0] Valid Delay Low                        1/2 t1 + 14   ns    5-4            1X Mode
t11     SCLK[1:0] Valid Delay                            [illegible]   ns    5-5, 5-6       1/2X, 1/3X Mode
t12     DATAIN[31:0] Setup                 [illegible]                 ns    5-4, 5-5, 5-6
t13     DATAIN[31:0] Hold                  [illegible]                 ns    5-4, 5-5, 5-6
t14     PIXCLK Valid Delay                               [illegible]   ns    5-7            (Note 2)
t15     PIXCLK Valid Delay                               20            ns    5-7            (Note 3)
t16     DRV[7:0], DGY[7:0], DBU[7:0],      [illegible]                 ns    5-8
        ALPHA[7:0], ACTDIS, CB, BPP[0],
        BPP[1] Output Setup
t17     DRV[7:0], DGY[7:0], DBU[7:0],      15                          ns    5-8
        ALPHA[7:0], ACTDIS, CB, BPP[0],
        BPP[1] Output Hold
t18     VBUS[3:0], SCLK[1:0], FCO,                       30            ns    5-9            (Note 4)
        HSYNC, VSYNC, CSYNC, CB, BG,
        PIXCLK, DRV[7:0], DGY[7:0],
        DBU[7:0], ALPHA[7:0], ACTDIS,
        BPP[0], BPP[1] Float Delay
t19     DISDIG, DRV[7:0], DGY[7:0],        3t1                         ns    5-10
        DBU[7:0], Digital Output
        Disable Delay
t20     DISDIG, DRV[7:0], DGY[7:0],        3t1                         ns    5-10
        DBU[7:0], Digital Output
        Enable Delay
t21     DISDAC, RV, GY, BU Analog                        19            ns    5-11           (Note 6)
        Output Disable Delay
t22     DISDAC, RV, GY, BU Analog                        19            ns    5-11           (Note 6)
        Output Enable Delay

NOTES:

1. This assumes a 35 ns period. For other speeds, the FREQIN High and Low Times should fall within a 40% to 60% duty
cycle.

2. For integer pixel times t14 is the Valid Delay on all assertions of PIXCLK during active display time.

3. For non-integer pixel times t15 is the Valid Delay on alternating assertions of PIXCLK during active display time.

4. Not 100% tested.

5. All A.C. specifications are measured at the 1.5V crossing point with a 50 pF load.

6. Analog output delay is measured at the 50% level of the full scale transition with RL = 75Ω and CL = 25 pF.
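
Note 1 gives a duty-cycle rule for FREQIN periods other than the one tabulated. A minimal sketch of that rule follows; the function names are hypothetical, and the 40% and 60% bounds are taken directly from the note.

```python
def freqin_period_ns(freq_mhz):
    """Convert a FREQIN frequency in MHz to a period in ns."""
    return 1000.0 / freq_mhz


def duty_cycle_window_ns(period_ns, low_frac=0.40, high_frac=0.60):
    """Return the (min, max) allowed high or low time for a given period."""
    return (low_frac * period_ns, high_frac * period_ns)


def high_low_times_ok(high_ns, low_ns, period_ns):
    """Check that both the high and low times fall inside the 40%-60% window."""
    lo, hi = duty_cycle_window_ns(period_ns)
    return lo <= high_ns <= hi and lo <= low_ns <= hi
```

For a 20 MHz clock (50 ns period) the window is 20 ns to 30 ns, so a symmetric 25 ns / 25 ns clock passes and a 15 ns / 35 ns clock does not.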



AC Characteristics 

Table 5-4. AC Characteristics at 45 MHz 



VCC = 5V ±10%, TCASE = 0°C to 95°C, CL = 50 pF

Symbol  Parameter                          Min           Max           Unit  Figure         Notes

        Frequency                          7             45            MHz                  1X Clock
t1      FREQIN Period                      22            140           ns    5-1
t2      FREQIN High Time                   7             15            ns    5-1            (Note 1)
t3      FREQIN Low Time                    7             15            ns    5-1            (Note 1)
t4      FREQIN Fall Time                                 4             ns    5-1
t5      FREQIN Rise Time                                 4             ns    5-1
t6a     HSYNC, VSYNC, CSYNC, BG,                         20            ns    5-2
        FCO Valid Delay
t6b     VBUS[3:0] Valid Delay                            [illegible]   ns    5-2
t7      RESETB#, VRESET#, HRESET#,                                     ns    5-3
        DISDIG, TESTACT Setup
t8      RESETB#, VRESET#, HRESET#,         [illegible]                 ns    5-3
        DISDIG, TESTACT Hold
t9      SCLK[1:0] Valid Delay High                       [illegible]   ns    5-4            1X Mode
t10     SCLK[1:0] Valid Delay Low                        1/2 t1 + 12   ns    5-4            1X Mode
t11     SCLK[1:0] Valid Delay                            12            ns    5-5, 5-6       1/2X, 1/3X Mode
t12     DATAIN[31:0] Setup                 [illegible]                 ns    5-4, 5-5, 5-6
t13     DATAIN[31:0] Hold                  [illegible]                 ns    5-4, 5-5, 5-6
t14     PIXCLK Valid Delay                               1/2 t1 + 20   ns    5-7            (Note 2)
t15     PIXCLK Valid Delay                               20            ns    5-7            (Note 3)
t16     DRV[7:0], DGY[7:0], DBU[7:0],                                  ns    5-8
        ALPHA[7:0], ACTDIS, CB, BPP[0],
        BPP[1]/VUGR Output Setup
t17     DRV[7:0], DGY[7:0], DBU[7:0],      10                          ns    5-8
        ALPHA[7:0], ACTDIS, CB, BPP[0],
        BPP[1]/VUGR Output Hold
t18     VBUS[3:0], SCLK[1:0], FCO,                       30            ns    5-9            (Note 4)
        HSYNC, VSYNC, DRV[7:0],
        DGY[7:0], ALPHA[7:0], ACTDIS,
        BPP[0], BPP[1]/VUGR Float Delay
t19     DISDIG, DRV[7:0], DGY[7:0],        3t1                         ns    5-10
        DBU[7:0], Digital Output
        Disable Delay
t20     DISDIG, DRV[7:0], DGY[7:0],        3t1                         ns    5-10
        DBU[7:0], Digital Output
        Enable Delay
t21     DISDAC, RV, GY, BU Analog                        19            ns    5-11           (Note 6)
        Output Disable Delay
t22     DISDAC, RV, GY, BU Analog                        19            ns    5-11           (Note 6)
        Output Enable Delay

NOTES:

1. This assumes a 22 ns period. For other speeds, the FREQIN High and Low Times should fall within a 40% to 60% duty
cycle.

2. For integer pixel times t14 is the Valid Delay on all assertions of PIXCLK during active display time.

3. For non-integer pixel times t15 is the Valid Delay on alternating assertions of PIXCLK during active display time.

4. Not 100% tested.

5. All A.C. specifications are measured at the 1.5V crossing point with a 50 pF load.

6. Analog output delay is measured at the 50% level of the full scale transition with RL = 75Ω and CL = 25 pF.
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Figure 5-1. Clock Waveforms 
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Figure 5-2. Output Waveforms 
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Figure 5-3. Input Waveforms 
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SCLK[1:0] 



DATAIN[31:0] 



Figure 5-4. 1X SCLK Mode 
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XDCZ 


Figure 5-5. 1/2X SCLK Mode 
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Figure 5-6. 1/3X SCLK Mode 
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Figure 5-7. PIXCLK Waveforms 
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Figure 5-8. Output Setup and Hold 
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Figure 5-9. TEST ACT # Float Delay 
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Figure 5-10. DISDIG to Digital Output Delay 



Signals shown: DISDAC (1.5V crossing point) and RV, GY, BU. Shading indicates the high-impedance state.

Figure 5-11. DISDAC to Analog Output Delay 



Digital to Analog Converter Electrical Characteristics 

Table 5-5. DAC D.C. Characteristics  AVCC = 5V ±10%; TCASE = 0°C to +95°C

Symbol  Parameter                   Min                     Typ           Max                     Unit  Notes

Iref    Reference Current                                                 1500                    µA
Ifs     Output Current              0.93*(255/18.5)*Iref                  1.07*(255/18.5)*Iref    mA    (Note 1)
        (Full Scale)
Vfs     Output Voltage                                      [illegible]   1.5                     V
        (Full Scale)
INL     Integral Nonlinearity                               1.0           ±3                      LSB
DNL     Differential Nonlinearity                                         ±1                      LSB
IACC    Analog Supply Current                                             3 * Ifs + 8             mA    (Note 2)
DDTR    DAC to DAC Tracking at                              2.0           5.0                     %     (Note 3)
        Full Scale
Cout    Output Capacitance                                                12                      pF    (Note 4)

NOTES:

1. Maximum Ifs allowed = 22 mA.

2. Maximum IACC allowed = 74 mA. Typical value of IACC = 3 * Ifs + 6.

3. Maximum deviation between RV, GY and BU outputs at full-scale output voltage.

4. Not 100% tested.

5. All DAC testing done with Iref = 1500 µA.
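
The current relationships in Table 5-5 and its notes can be worked through numerically. The sketch below is illustrative only: the function names are hypothetical, and treating (255/18.5) * Iref as the nominal full-scale current between the 0.93 and 1.07 limits is an assumption.

```python
def full_scale_current_ma(iref_ua):
    """Return (min, nominal, max) full-scale output current in mA.

    iref_ua is the reference current in microamps. The 0.93 and 1.07
    factors are the Min and Max limits from Table 5-5; the nominal
    value in between is an assumption of this sketch.
    """
    nominal = (255 / 18.5) * iref_ua / 1000.0
    return (0.93 * nominal, nominal, 1.07 * nominal)


def analog_supply_current_ma(ifs_ma, typical=False):
    """IACC in mA: 3 * Ifs + 8 (maximum) or 3 * Ifs + 6 (typical, Note 2)."""
    return 3 * ifs_ma + (6 if typical else 8)
```

With Iref = 1500 µA the nominal full-scale current is about 20.7 mA and the upper limit is about 22.1 mA, in line with the 22 mA ceiling of Note 1; `analog_supply_current_ma(22)` gives the 74 mA ceiling of Note 2.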
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Table 5-6. DAC A.C. Characteristics 


Symbol  Parameter            Min    Typ           Max           Unit     Notes

tr, tf  Rise/Fall Time                            [illegible]   ns       (Note 1)
ClkF    Clock Feedthrough           -28                         dB       (Note 2)
GlEn    Glitch Energy               [illegible]                 pV-sec   (Notes 2, 3)
Skew    Output Skew                               [illegible]   ns
Xtlk    Crosstalk                   200                         pV-sec   (Note 2)

NOTES:

1. Maximum value is for RL = 75Ω and CL = [illegible], for the 10% to 90% full-scale transition.

2. Assumes an 80 MHz filter on output.

3. Glitch energy generated from the influence that active outputs have on an idle output.

4. DISDIG must be tied high.

5. Assumes the use of a 0.1 µF capacitor between VGCS and AVCC, and 0.1 µF and 10 µF capacitors between IREFIN and AVCC.





RL = 75Ω
RL = Load Resistance
C2 = 10 nF
CL = Load Capacitance

Vfs = Ifs * RL

where:

0 < Iout < Ifs
0 < Vout < Vfs

Tr = Tf ≈ 3 * RL * (CL + COUT)
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Figure 5-12. Typical Output Configuration 
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Output Delay and Rise Time versus Load Capacitance 



Vertical axis: Typical Output Delay (ns), from nom to nom + 6.
Horizontal axis: CL (picofarads), from 50 to 150.
nom = nominal value given in A.C. Characteristics table.

NOTE:
This graph will not be linear outside of the CL range shown.

Figure 5-13. Typical Output Valid Delay versus Load Capacitance under Worst Case Conditions 
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Figure 5-14. Typical Output Rise Time versus Load Capacitance under Worst Case Conditions 
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6.0 MECHANICAL DATA 



Packaging Outlines and Dimensions 

Intel packages the 82750DB in a Plastic Quad Flat 
Pack (PQFP). Table 6-1 gives the symbol list for the 
PQFP. 

Table 6-1. PQFP Symbol List

Letter or Symbol   Description of Dimensions

A                  Package Height: Distance from Seating Plane to Highest Point of Body
A1                 Standoff: Distance from Seating Plane to Base Plane
D/E                Overall Package Dimension: Lead Tip to Lead Tip
D1/E1              Plastic Body Dimension
D2/E2              Bumper Distance
D3/E3              Footprint
D4/E4              Foot Radius Location
L1                 Foot Length
N                  Total Number of Leads



The PQFP has the following specifications: 

1. All dimensions and tolerances conform to ANSI 
Y14.5M-1982. 

2. Datum plane -H- is located at the mold parting line 
and coincident with the bottom of the lead where 
lead exits plastic body. 

3. Datums A-B and -D- are to be determined where 
center leads exit plastic body at datum plane -H-. 

4. Controlling dimension is the inch. 

5. Dimensions D1, D2, E1, and E2 are measured at 
the mold parting line and do not include mold protrusion. 
Allowable mold protrusion is 0.18 mm 
(0.007 in.) per side. 

6. Pin 1 identifier is located within one of the two 
zones indicated. 

7. Measured at datum plane -H-. 

8. Measured at seating plane datum -C-. 

Table 6-2 provides outline characteristics for 
0.025-in. pitch. 

Table 6-2. Intel Case Outline Drawings 
for PQFP at 0.025 Inch Pitch 

Symbol    Description             Min          Max

N         Leadcount               132          132
A         Package Height          0.160        0.180
A1        Standoff                0.020        0.040
D, E      Terminal Dimension      1.070        1.090
D1, E1    Package Body            0.947        0.953
D2, E2    Bumper Distance         1.097        1.103
D3, E3    Lead Dimension          0.800 REF    0.800 REF
D4, E4    Foot Radius Location    1.023        1.037
L1        Foot Length             0.020        0.030



Figure 6-1. Principal Dimensions of the 82750DB in the 132-Lead PQFP Package 

Figure 6-2. 132-Lead PQFP Mechanical Package Detail: Typical Lead 

Figure 6-3. 132-Lead PQFP Mechanical Package Detail: Protective Bumper 

Figure 6-4. Detailed Dimensions of the 82750DB in the 132-Lead PQFP Package: Molded Details 

Figure 6-5. Detailed Dimensions of the 82750DB in the 132-Lead PQFP Package: Terminal Details 

NOTES (dimensions in the figures are given in mm (inch)):

1. All dimensions and tolerances conform to ANSI Y14.5M-1982.

2. Datum plane -H- located at the mold parting line and coincident with the bottom of the lead where lead exits plastic body.

3. Datums A-B and -D- to be determined where center leads exit plastic body at datum plane -H-.

4. Controlling dimension, inch.

5. Dimensions D1, D2, E1 and E2 are measured at the mold parting line. D1 and E1 do not include an allowable mold protrusion of 0.18 mm (.007 in) per side. D2 and E2 do not include a total allowable mold protrusion of 0.18 mm (.007 in) at maximum package size.

6. Pin 1 identifier is located within one of the two zones indicated.

7. Measured at datum plane -H-.

8. Measured at seating plane datum -C-.
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Package Thermal Specifications 

The 82750DB is specified for operation when TC 
(the case temperature) is within the range of 0°C to 
95°C. TC may be measured in any environment to determine 
whether the 82750DB is within the specified operating 
range. The case temperature should be measured 
at the center of the top surface. 

TA (the ambient temperature) can be calculated 
from θCA (thermal resistance from case to ambient) 
with the following equation: 

TA = TC - P * θCA 

Typical values for θCA at various airflows are given 
in Table 6-3 for the 132-lead PQFP package. When 
using the digital outputs, Table 6-4 shows the maximum 
TA allowable (without exceeding TC) at various 
airflows. The power dissipation (P) is calculated by 
using the typical supply currents at 5V as shown in 
Table 5-2. 

Similarly, when using the analog outputs, the maximum 
TA allowed is a function of Ifs. The power is given by 
the following equation, which can then be used in 
calculating the maximum TA. 

P = 5V * (ICCNT + (3 * Ifs + 6)) 



Table 6-3. Thermal Resistances (°C/W) 

                   θCA Versus Airflow, ft/min (m/sec)

Package            0       200      400      600      800      1000
                   (0)     (1.01)   (2.03)   (3.04)   (4.06)   (5.07)

132-Lead PQFP      26.0    17.5     14.0     11.5     9.5      8.5



Table 6-4. Maximum TA at Various Airflows (°C) 

                               TA Versus Airflow, ft/min (m/sec)

Package          Frequency     0       200      400      600      800      1000
                 (MHz)         (0)     (1.01)   (2.03)   (3.04)   (4.06)   (5.07)

132-Lead PQFP    28            71      79       82       84       86       87
                 45            59      71       75       79       82       83
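
The thermal relationship TA = TC - P * θCA can be checked against Table 6-4 with a few lines of arithmetic. This is a sketch under stated assumptions: the names `THETA_CA`, `max_ambient_c` and `analog_mode_power_w` are hypothetical, the θCA values are the typical figures from Table 6-3, and the case temperature limit is the 95°C upper bound of the specified range.

```python
T_CASE_MAX_C = 95.0

# Typical case-to-ambient thermal resistance (°C/W), keyed by airflow in ft/min.
THETA_CA = {0: 26.0, 200: 17.5, 400: 14.0, 600: 11.5, 800: 9.5, 1000: 8.5}


def max_ambient_c(power_w, airflow_ft_min, t_case_c=T_CASE_MAX_C):
    """Maximum ambient temperature: TA = TC - P * thetaCA."""
    return t_case_c - power_w * THETA_CA[airflow_ft_min]


def analog_mode_power_w(iccnt_ma, ifs_ma, vcc=5.0):
    """Power with analog outputs in use: P = VCC * (ICCNT + (3 * Ifs + 6))."""
    return vcc * (iccnt_ma + (3 * ifs_ma + 6)) / 1000.0
```

Using the typical 45 MHz supply current of 280 mA from Table 5-2, P = 5V * 0.280 A = 1.4 W, and `max_ambient_c(1.4, 0)` evaluates to about 58.6°C, which rounds to the 59°C entry in Table 6-4.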
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82750PB 
PIXEL PROCESSOR 



25 MHz Clock with Single Cycle 
Execution 

Zero Branch Delay 

Wide Instruction Word Processor 

512 x 48-Bit Instruction RAM 

512 x 16-Bit Data RAM 

Two Internal 16-Bit Buses 

ALU with Dual-Add-With-Saturation 
Mode 

Variable Length Sequence Decoder 



Pixel Interpolator 

High Performance Memory Interface 

— 32-Bit Memory Data Bus 

— 50 MBytes per Second Maximum 

— 25 MBytes per Second with Standard 
VRAMs or DRAMs 

16 General-Purpose Registers 

4 Gbyte Linear Address Space 

132-Pin PQFP 

Compatible with the 82750PA 




Intel's 82750PB is a 25 MHz wide instruction processor that generates and manipulates pixels. When paired 
with its companion chip, the 82750DB, and used to implement a DVI Technology video subsystem, the 
82750PB provides real time (30 images/sec) pixel processing, real time video compression, interactive motion 
video playback and real time video effects. 

Real time pixel manipulations, including 30 images/sec video compression, are supported by the 25 MHz 
instruction rate. On-chip instruction RAM provides programmability for execution of a wide range of algorithms 
that support motion video decompression, text, and 2D and 3D graphics. Inner loops are optimized with the 
integration of sixteen 16-bit quad ported registers, on-chip DRAM, and two loop counters that provide zero 
delay two-way branching "free" in any instruction. Two, 16-bit internal buses enable two parallel register 
transfers on each 82750PB instruction, contributing to the real time performance of the video processing. 
Another feature that adds to the processing power of the 82750PB is the 16-bit ALU, which includes an 8-bit 
dual-add-with-saturate operation critical for pixel arithmetic. Other specialized features for pixel processing 
include a 2D pixel interpolator for image processing functions and a variable length sequence decoder for 
decoding compressed data. 

The 82750PB is implemented using Intel's low-power CHMOS IV Technology and is packaged in a 132-lead 
space-saving, plastic quad flat pack (PQFP) package. 
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82750PB Subsystem Diagram 



Intel Corporation assumes no responsibility for the use of any circuitry other than circuitry embodied in an Intel product. No other circuit patent 
licenses are implied. Information contained herein supersedes previously published specifications on these devices from Intel. February 1991 

©INTEL CORPORATION, 1991    Order Number: 240854-003 
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1.0 82750PB PIN DESCRIPTION 



Pinout 



Figure 1-1. 82750PB Pinout (132-lead PQFP, top view) 
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Table 1-1. Pin Cross Reference by Pin Name 



Pin Name  Location    Pin Name  Location    Pin Name  Location    Pin Name  Location

A2        71          BE3#      41          D30       119         VCC       100
A3        72          CLKIN     47          D31       118         VCC       104
A4        73          CLKOUT    114         HALEN#    55          VSS       94
A5        74          D0        28          HALT#     31          VCC       109
A6        77          D1        27          HBUSEN#   36          VCC       116
A7        78          D2        26          HINT#     30          VCC       123
A8        79          D3        24          HRAM#     58          VCC       127
A9        80          D4        23          HRDY#     38          VCC       132
A10       81          D5        22          HREG#     40          VSS       1
A11       83          D6        20          HREQ#     56          VSS       32
A12       84          D7        19          MRDY#     60          VSS       34
A13       85          D8        18          MREQ#     59          VSS       39
A14       86          D9        16          NXTFST#   61          VSS       48
A15       87          D10       15          PMFRZ#    70          VSS       57
A16       88          D11       14          RESET#    63          VSS       66
A17       90          D12       13          RFSH#     62          VSS       68
A18       92          D13       12          TEST#     69          VSS       76
A19       93          D14       11          TRNFR#    37          VSS       89
A20       95          D15       9           VBUS[0]   54          VSS       99
A21       96          D16       8           VBUS[1]   53          VSS       101
A22       97          D17       7           VBUS[2]   52          VSS       108
A23       102         D18       6           VBUS[3]   50          VSS       115
A24       103         D19       5           VCC       2           VSS       117
A25       105         D20       4           VCC       33          VSS       124
A26       106         D21       3           VCC       35          VSS       131
A27       107         D22       130         VCC       45          VSS       10
A28       110         D23       129         VCC       51          VSS       17
A29       111         D24       128         VCC       65          VSS       21
A30       112         D25       126         VCC       67          VSS       25
A31       113         D26       125         VCC       75          VSS       29
BE0#      44          D27       122         VCC       82          VSS       46
BE1#      43          D28       121         VCC       91          VSS       64
BE2#      42          D29       120         VCC       98          WE#       49
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Table 1-2. Pin Cross Reference by Location 



Location  Pin Name    Location  Pin Name    Location  Pin Name    Location  Pin Name

1         VSS         34        VSS         67        VCC         100       VCC
2         VCC         35        VCC         68        VSS         101       VSS
3         D21         36        HBUSEN#     69        TEST#       102       A23
4         D20         37        TRNFR#      70        PMFRZ#      103       A24
5         D19         38        HRDY#       71        A2          104       VCC
6         D18         39        VSS         72        A3          105       A25
7         D17         40        HREG#       73        A4          106       A26
8         D16         41        BE3#        74        A5          107       A27
9         D15         42        BE2#        75        VCC         108       VSS
10        VSS         43        BE1#        76        VSS         109       VCC
11        D14         44        BE0#        77        A6          110       A28
12        D13         45        VCC         78        A7          111       A29
13        D12         46        VSS         79        A8          112       A30
14        D11         47        CLKIN       80        A9          113       A31
15        D10         48        VSS         81        A10         114       CLKOUT
16        D9          49        WE#         82        VCC         115       VSS
17        VSS         50        VBUS[3]     83        A11         116       VCC
18        D8          51        VCC         84        A12         117       VSS
19        D7          52        VBUS[2]     85        A13         118       D31
20        D6          53        VBUS[1]     86        A14         119       D30
21        VSS         54        VBUS[0]     87        A15         120       D29
22        D5          55        HALEN#      88        A16         121       D28
23        D4          56        HREQ#       89        VSS         122       D27
24        D3          57        VSS         90        A17         123       VCC
25        VSS         58        HRAM#       91        VCC         124       VSS
26        D2          59        MREQ#       92        A18         125       D26
27        D1          60        MRDY#       93        A19         126       D25
28        D0          61        NXTFST#     94        VSS         127       VCC
29        VSS         62        RFSH#       95        A20         128       D24
30        HINT#       63        RESET#      96        A21         129       D23
31        HALT#       64        VSS         97        A22         130       D22
32        VSS         65        VCC         98        VCC         131       VSS
33        VCC         66        VSS         99        VSS         132       VCC
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Figure 1-2. 82750PB Functional Signal Groupings 
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Quick Pin Reference 

Table 1-3 provides descriptions of 82750PB pins. 



Table 1-3. Pin Descriptions 



Symbol 


Type 


Name and Function 


CLKIN 


I 


CLKIN is a 1X CLOCK INPUT that provides the fundamental timing for the 
82750PB. One cycle of CLKIN is denoted as one T-cycle. 


RESET# 


I 


The 82750PB is reset and initialized by holding this signal active for at least ten 
T-cycles. Refer to Initializing the 82750PB Section in Chapter 3. 


HREQ# 


I 


The HOST REQUEST signal is a request from the host CPU to perform a read 
or write access to either registers on the 82750PB, an external device, or to 
VRAM shared by the 82750PB and the host. The type of access that is 
requested is determined by the host access definition signals: HREG#, 
HRAM#, and WE#. 


HREG#, 
HRAM# 


I 


The HOST REGISTER and HOST RAM signals, when validated by HREQ#, 
are used to define three host access cycles. HRAM# active indicates the host 
is requesting a VRAM read or write cycle. HREG# active indicates that the 
host is requesting an 82750PB register read or register write cycle. When both 
signals are inactive, a host external cycle is requested. 


HBUSEN# 


O 


HOST BUS ENABLE is asserted by the 82750PB at the start of a host access 
to indicate that the 82750PB Address and Data buses (A[31:2], BE#[3:0], and 
D[31:0]) have been tri-stated. This allows the host to drive the same buses 
either for accessing shared VRAM or the 82750PB internal registers. 


HALEN# 


I 


The HOST ADDRESS LATCH ENABLE signal is used to indicate to the 
82750PB that the host has asserted a valid address (A[31:2], BE#[3:0]) and 
write enable (WE#). 


HRDY# 


O 


HOST READY is asserted by the 82750PB at the end of a host access to 
indicate that the access cycle is ready for data transfer. For a host write cycle, 
HRDY# indicates that the 82750PB is ready to accept data from the host. For 
a host VRAM write cycle, HRDY# indicates that the VRAM has latched the 
data from the host. For a host read cycle, HRDY# indicates that output data 
from the 82750PB or VRAM is ready to be latched by the host. 


HINT# 


O 


HOST INTERRUPT: This output is asserted when an interrupt condition is 
detected by the 82750PB, and the enable bit in the PROCESSOR CONTROL 
register corresponding to that interrupt condition is set to a ONE. HINT# stays 
active until the host CPU reads the INTERRUPT STATUS register. If an 
interrupt condition that is enabled occurs during the same cycle that the 
INTERRUPT STATUS register is being read, HINT# remains active. 


D[31:0] 


I/O 


The DATA BUS is used to transfer data between: 

1. The 82750PB and VRAM, and 

2. The Host CPU and internal 82750PB registers. 

During host VRAM accesses, this bus is tri-stated to allow the host to share 
the same VRAM data bus. During host accesses to internal 82750PB registers 
all 32 bits are used for data transfer. 


A[31:9] 
A[8:2] 



I/O 


The ADDRESS BUS is shared between the 82750PB and the host for 
addressing VRAM. This 30-pin bus addresses 32-bit double words in VRAM. 
Byte Enable signals are used to address individual bytes or words within a 
double word in VRAM. In addition, the address for host accesses to internal 
82750PB registers is communicated to the 82750PB using the lower seven 
pins, A[8:2], and the BE# pins. During host access cycles to either VRAM or 
82750PB internal registers, A[31:2] are tri-stated. For internal register 
accesses, as indicated by HREG# being low, the lower seven bits, A[8:2], are 
used as the host address input. 


CLKOUT 


O 


The CLOCK OUTPUT signal is one of the two internal clocks and is 
synchronized with CLKIN. It is always driven and will have a 50% duty cycle. 
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Table 1-3. Pin Descriptions (Continued) 



Symbol 


Type 


Name and Function 


BE#[3:0] 


I/O 


The BYTE ENABLE BUS is shared by the 82750PB and the host for 
addressing VRAM down to the byte level. The correspondence between 
the four Byte Enable pins and the D[31:0] pins is: BE#[3]-D[31:24], 
BE#[2]-D[23:16], BE#[1]-D[15:8], and BE#[0]-D[7:0]. During VRAM 
read cycles, the 82750PB enables all four bytes. During write cycles the 
82750PB only enables those bytes that are to be written. Bytes that are 
not enabled are not to be altered in VRAM. During host accesses to 
82750PB on-chip registers, the BE#[0] pin is used as an input to select 
whether the even or odd word is being accessed; the double word 
address is provided by the host on the A[8:2] pins. BE#[0] = 0 indicates 
that data is transferred on D[15:0]. BE#[0] = 1 indicates that data is 
transferred on D[31:16]. 


MREQ# 


O 


MEMORY REQUEST is asserted for the first cycle, T1, of each VRAM 
cycle. 


TRNFR#, 
RFSH# 


O 


The MEMORY CYCLE DEFINITION SIGNALS: Transfer, Refresh and 
Write Enable are asserted at the same time as MREQ#, but stay active 
for the entire VRAM cycle. TRNFR# active indicates a VRAM transfer 
cycle. RFSH# active indicates a VRAM refresh cycle. If neither TRNFR# 
nor RFSH# is active, a VRAM data read or write cycle is requested. 


WE# 


I/O 


The WRITE ENABLE pin is used as an output during an 82750PB/VRAM 
cycle to drive the WE# signal, which defines the access as a VRAM read 
cycle (when inactive) or write cycle (when active). During Host/VRAM 
and Host External cycles, the 82750PB tri-states this pin to allow the host 
to drive the VRAM write enable signals directly. During Host/register 
cycles, this pin is used as an input for the Host Write Enable signal to 
determine whether the host is reading or writing the 82750PB register. 


NXTFST# 





The NEXT FAST signal indicates that the following VRAM cycle can be 
performed with a page-mode or bank-interleaved access. This signal is 
asserted during the first of a pair of VRAM cycles that is guaranteed to be 
within the same VRAM page and in opposite banks, that is, a pair of accesses 
to two sequential double words in VRAM at addresses Even Address and 
Even Address + 1. In other words, A[2] is a zero for the first cycle and a 
one for the second cycle. 


MRDY# 


I 


The MEMORY READY input indicates that the VRAM cycle has 
progressed to the point where it is ready to perform the data transfer. For 
a VRAM read cycle, the VRAM data can be latched by the transition of 
MRDY# to an active state. For a VRAM write cycle, MRDY# indicates 
that the data has been latched into the VRAMs. 


VBUS[3:0] 


I 


The VDP COMMUNICATION BUS is used to communicate from the 
82750DB to the 82750PB. Codes sent over this bus indicate interrupt 
requests, transfer requests, and status information. Since the 82750DB 
and 82750PB run asynchronously, the VBUS signals are sampled on the 
falling edge of CLKIN and compared with the previous sample. For a 
VBUS code to be detected by the 82750PB, it must be valid for two 
successive samples. 


HALT# 


I 


The HALT signal causes the microcode processor on the 82750PB to 
halt prior to executing the next instruction. This signal does not halt the 
VRAM interface. The Halt signal will allow the design of a hardware 
emulator for the 82750PB based on an 82750PB chip. 


TEST# 


I 


The TEST signal is used for test purposes only and must remain high for 
normal operation. 
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Table 1-3. Pin Descriptions (Continued) 



Symbol 


Type 


Name and Function 


PMFRZ# 





The PERFORMANCE MONITORING AND FREEZE signal is toggled by 
specific microcode instructions and can be used to determine the time 
required to execute certain sections of microcode. 


Vcc 


I 


POWER pins provide the + 5V D.C. supply input. 


Vss 


I 


GROUND pins provide the 0V connection to which all inputs and outputs 
are referenced. 



Table 1-4. Output Pins 

Name             Active Level   When Floated 
CLKOUT           High           Always Driven 
A[31:9]          High           Reset*, Host Cycle 
HBUSEN#          Low            Reset* 
HRDY#            Low            Reset* 
HINT#            Low            Reset* 
MREQ#            Low            Reset* 
TRNFR#, RFSH#    Low            Reset* 
NXTFST#          Low            Reset* 
PMFRZ#           Low            Reset* 

*The reset state is caused by RESET# being active low. 

Table 1-5. Input Pins 

Name         Active Level   Synchronous/Asynchronous 
CLKIN        High           Synchronous 
RESET#       Low            Asynchronous 
HREQ#        Low            Asynchronous* 
HREG#        Low            Synchronous 
HRAM#        Low            Synchronous 
MRDY#        Low            Synchronous 
VBUS[3:0]    High           Asynchronous 
HALT#        Low            Synchronous 
HALEN#       Low            Asynchronous* 

*Can be programmed to accept synchronous inputs. 

Table 1-6. Input/Output Pins 

Name        Active Level   When Floated         Synch/Async 
D[31:0]     High           Reset*, Host Cycle   Synchronous 
A[8:2]      High           Reset*, Host Cycle   Synchronous 
BE#[3:0]    Low            Reset*, Host Cycle   Synchronous 
WE#         Low            Reset*, Host Cycle   Synchronous 

*The reset state is caused by RESET# being active low. 

All output pins are floated when RESET# is active low. 
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2.0 ARCHITECTURE 



Overview 

The 82750PB includes a wide instruction word 
processor that comprises a number of processing, 
storage, and input/output elements. The wide 
instruction word architecture allows a number of these 
elements to operate in parallel. The 82750PB 
executes one instruction every internal clock cycle, or 
T-cycle. The various elements are connected via 
two 16-bit buses, the A bus and B bus, as shown in 
Figure 2-1. During each instruction execution cycle, 
data can be transferred from a bus source to a bus 
destination element on both buses. 



Registers 



{rN; N = 0-15} 



There are 16 general-purpose data registers, each 
16 bits wide, that are connected to both the A bus 
and B bus as both sources and destinations. These 
registers are designated r0-r15. All the registers are 
functionally identical except r0, which also includes 
logic for bit shifting and byte swapping. A register 
can source both the A bus and the B bus in the 
same cycle. A register cannot be the destination of 
both the A bus and the B bus in a single instruction. 
Because the registers are doubly latched, the same 
register may be both a source and destination in the 
same cycle. The result is that the data in the register 
prior to the current cycle will be driven on the source 
bus, and the data on the destination bus will be 
latched into the register at the end of the cycle. 

Register r0 has additional logic to allow bit shifting 
and byte swapping. The value in r0 can be shifted 
left or right one bit position per instruction cycle. For 
a right shift, the new MSB is equal to the old MSB; in 
other words, the value is sign-extended. For left 
shifting, the new LSB is equal to zero. r0 cannot be 
shifted and loaded in the same instruction. Byte 
swapping, on the other hand, only occurs when r0 is 
being loaded with a value from the A bus or B bus. 
Byte swapping causes the most significant byte and 
the least significant byte of the 16-bit value being 
loaded into r0 to be interchanged. Refer to Chapter 
4 for a description of the SHFT microcode field that 
controls the shifting and swapping operations in r0. 
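
The r0 shift and swap behavior described above can be illustrated with a small Python model. This is only a sketch of the arithmetic on a 16-bit value; the function names are hypothetical and do not correspond to microcode mnemonics or to the SHFT field encoding.

```python
MASK16 = 0xFFFF

def r0_shift_right(value):
    """Shift right by one bit; the old MSB is copied into the new MSB."""
    value &= MASK16
    return (value >> 1) | (value & 0x8000)

def r0_shift_left(value):
    """Shift left by one bit; the new LSB is zero, the old MSB is dropped."""
    return (value << 1) & MASK16

def r0_byte_swap(value):
    """Interchange the upper and lower bytes of a 16-bit value being loaded."""
    value &= MASK16
    return ((value << 8) | (value >> 8)) & MASK16
```

For instance, shifting 0x8002 right keeps the sign bit set, and swapping 0x12AB exchanges its two bytes.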



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ALU 



Table 2-2. ALU Opcodes 



{alu, cc} 



The ALU performs 16-bit arithmetic and logic opera- 
tions, and can also be operated as two independent 
8-bit ALUs for the Dual-Add-with-Saturate operation. 
There are two fields in the microcode instruction that 
affect the operation of the ALU: the ALUOP field 
specifies the operation to be performed, and the 
ALUSS field specifies the source of the two ALU 
inputs. Refer to Chapter 4 for further information on 
these fields. 

The two ALU operands either come from values 
held in the ALU input latches or from "eavesdrop- 
ping" on the A or B buses. The result of any ALU 
operation is latched in the ALU output register, alu. 
In a subsequent instruction this result can be trans- 
ferred to any A or B destination. 

The ALU has four condition flag outputs: CarryOut, 
Sign, Overflow, and Zero. CarryOut is the carry out 
of the most significant bit position. Sign is equal to 
the value of the most significant bit of the result. 
Overflow is the exclusive-OR of CarryOut and the 
CarryIn to the most significant bit position of the 
result. Zero is true (a value of 1) if all 16 bits of the 
ALU result are equal to zero. CarryOut and Overflow 
are defined as equal to zero for all logical 
operations. For most ALU operations, the state of these 
four condition flags is latched when the operation 
is complete. There are eight operations (nop, a*, b*, 
+], -], 0*, prof and int) that are exceptions. These 
operations are performed without disturbing the 
condition state of the previous ALU operation. 

Microcode routines can read and write the ALU 
condition flag register, cc. This can be used to save and 
restore the state of these flags. The bit ordering of 
the ALU condition flags within cc is given in Table 
2-1. A complete list of ALU opcodes is given in Table 
2-2. 



Table 2-1. Bit Assignments for cc Register 

Bit         Condition 
Bit 0       False (This bit of the cc is always read as a zero.)* 
Bit 1       ALU Carry Out 
Bit 2       ALU Overflow 
Bit 3       ALU Sign 
Bit 4       ALU Zero 
Bit 5       Loop Counter Zero* 
Bit 6       r0 LSB* 
Bit 7       r0 MSB* 
Bit 15:8    RESERVED. The state of these bits is 
            undefined when read; write as zeros. 
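
The flag definitions and the cc bit assignments above can be sketched in Python for a 16-bit add. This is a minimal model under stated assumptions, not the chip's logic: the function names are hypothetical, only addition is modeled, and the read-only bits 5-7 (loop counter zero, r0 LSB, r0 MSB) are left at zero.

```python
MASK16 = 0xFFFF

def alu_add(a, b, carry_in=0):
    """Return (result, flags) for a 16-bit add, following the flag definitions."""
    a &= MASK16
    b &= MASK16
    total = a + b + carry_in
    result = total & MASK16
    carry_out = (total >> 16) & 1
    # Carry into the most significant bit: add only the low 15 bits.
    carry_into_msb = (((a & 0x7FFF) + (b & 0x7FFF) + carry_in) >> 15) & 1
    flags = {
        "carry": carry_out,
        "overflow": carry_out ^ carry_into_msb,
        "sign": result >> 15,
        "zero": int(result == 0),
    }
    return result, flags

def pack_cc(flags):
    """Pack the four ALU flags into cc bits 1-4 (bit 0 always reads as zero)."""
    return (
        (flags["carry"] << 1)
        | (flags["overflow"] << 2)
        | (flags["sign"] << 3)
        | (flags["zero"] << 4)
    )
```

Adding 1 to 0x7FFF sets Overflow and Sign, while adding 1 to 0xFFFF sets CarryOut and Zero.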



Operation                     Mnemonic 
No Operation                  nop 
pass a                        a 
pass b                        b 
1's complement of a           ~a 
1's complement of b           ~b 
a AND b                       & 
(NOT a) AND b                 ~& 
a AND (NOT b)                 &~ 
a OR b                        | 
a XOR b                       ^ 
a + b                         + 
a + b + 1                     ++ 
a - b                         - 
-a + b                        -+ 
2's complement of a           -a 
2's complement of b           -b 
Increment a                   a++ 
Increment b                   b++ 
Decrement a                   a-- 
Decrement b                   b-- 
Dual Add with Sat.            +] 
a + b + (Prev. Carry)         +< 
a - b - (Prev. Borrow)        -< 
-a + b - (Prev. Borrow)       -+< 
Interrupt Host                int 
Zero                          0* 
Pass a, Don't Latch Flags     a* 
Pass b, Don't Latch Flags     b* 
(NOT a) OR b                  ~| 
a OR (NOT b)                  |~ 
Dual Sub. with Sat.           -] 
Perform. Monitor/Profile      prof 




*These are read-only values and are not affected by writes to the cc 
register. 

The Dual-Add-with-Saturate operation performs 
independent 8-bit ADDs on the upper and lower bytes 
of the two ALU operands. The two bytes of the A 
operand are treated as unsigned binary numbers 
(00:FF hex corresponds to 0:255 decimal). The two bytes of 
the B operand are treated as offset binary numbers 
with an offset of +128 (00:FF hex corresponds to 
-128:127 decimal). The upper and lower byte results 
are treated as 9-bit offset binary, including the carry 
output of each byte, with a +128 offset (000:1FF hex 
corresponds to -128:383 decimal) and are saturated to 
a range of 0-255 decimal. A result that is less than zero is 
set equal to zero or 00 hex and a result that is greater 
than +255 is set equal to +255 or FF hex. 

In fact, this operation is symmetric. Either the A op- 
erand or the B operand can be defined as the un- 
signed binary value, and the other operand will be 
treated as the offset signed binary value. 

Dual-subtract-with-saturate is similar to 
dual-add-with-saturate. It calculates A - B + 128 on each 
8-bit half of the two 16-bit inputs, and clamps the 
results to 0 and 255. This can be viewed as 
subtracting an offset-binary signed byte (-128 to 127) from 
an unsigned byte (0 to 255). 
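
The two saturating operations can be modeled per byte as follows. This Python sketch assumes the A operand carries the unsigned bytes and the B operand carries the offset-binary bytes; the function names are hypothetical and the code illustrates only the arithmetic, not the ALU implementation.

```python
def _clamp_byte(x):
    """Saturate an integer to the range 0-255."""
    return max(0, min(255, x))

def dual_add_sat(a, b):
    """Add each offset-binary byte of b (offset +128) to the unsigned byte of a."""
    out = 0
    for shift in (0, 8):
        unsigned_a = (a >> shift) & 0xFF
        signed_b = ((b >> shift) & 0xFF) - 128
        out |= _clamp_byte(unsigned_a + signed_b) << shift
    return out

def dual_sub_sat(a, b):
    """Compute A - B + 128 on each byte and clamp the result to 0-255."""
    out = 0
    for shift in (0, 8):
        unsigned_a = (a >> shift) & 0xFF
        raw_b = (b >> shift) & 0xFF
        out |= _clamp_byte(unsigned_a - raw_b + 128) << shift
    return out
```

A B byte of 80 hex represents zero, so `dual_add_sat(x, 0x8080)` returns `x` unchanged.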

The ALU opcode 'int' generates the MCINT (micro- 
code interrupt) condition. When this condition is de- 
tected by interrupt logic in the host CPU interface, 
and if the Enable MCINT bit in the PROCESSOR 
CONTROL register is set to a ONE, the host inter- 
rupt output, HINT#, will be asserted. Refer to Chap- 
ter 3 for further information on host interface. 

The 'prof' opcode activates the PMFRZ# pin, and is 
primarily used for performance monitoring and/or 
debugging. 

Barrel Shifter 

{shift, shift-r, shift-rl, shift-l} 

The barrel shifter performs a single-cycle, n-bit left or 
right shift. The barrel shifter operates independently of 
the ALU. The three barrel shifter operations are: 
shift-r for a right shift with sign extend; shift-rl for a 
right shift with zero fill; and shift-l for a left shift with 
zero fill. The shift operation is invoked by writing a 
4-bit value (the shift amount) to one of three A bus 
registers, depending on which of the three 
operations is to be performed. The operand is taken from 
the B bus, and the result is stored in the barrel 
shifter output register, shift. Like the ALU result register, 
the value in shift can be read onto the A bus or B 
bus in the following instruction cycle. 

A barrel shifter operation does not affect any of the 
condition flags. 
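
The three barrel shifter operations can be modeled as below. This is a Python sketch of the 16-bit results only, with a 4-bit shift amount; the function names mirror the register names in the text, but how the operand and amount are routed over the buses is not modeled.

```python
def shift_r(value, amount):
    """Right shift with sign extension (arithmetic shift) of a 16-bit value."""
    value &= 0xFFFF
    amount &= 0xF
    if value & 0x8000:
        value -= 0x10000  # reinterpret as a negative number
    return (value >> amount) & 0xFFFF

def shift_rl(value, amount):
    """Right shift with zero fill (logical shift) of a 16-bit value."""
    return (value & 0xFFFF) >> (amount & 0xF)

def shift_l(value, amount):
    """Left shift with zero fill of a 16-bit value."""
    return (value << (amount & 0xF)) & 0xFFFF
```

Shifting 0x8000 right by four gives 0xF800 with sign extension and 0x0800 with zero fill.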



Data RAM 

{dramN, *dramN, ++, --; N = 1-4} 

The Data RAM holds 512 16-bit words that are 
accessed using four pointers. To access a value in a 
particular location, the microcode routine must first 
load a pointer with the address to be accessed, and 
then perform a read or write using the same pointer. 
In parallel with the data RAM access, the pointer 
can optionally be post-incremented or 
post-decremented. The four pointers, referred to as 
dram1-dram4, can be written and read via the A bus. When 
a dram pointer, which is only 9 bits wide, is read onto 
the A bus, its upper seven bits are set to zeros. 

NOTE: 



The width of the dram pointers may change in 
later versions of the 82750PB. Software should 
not rely on the width of a pointer to, for exam- 
ple, mask the upper seven bits of a value to 
zero. 



All four pointers can be used to read or write the 
Data RAM from either the A or B bus. Only one Data 
RAM access can be performed in any cycle. A Data 
RAM access is referred to, using C language syntax, 
as *dram1. The * means "the value pointed to by". 
As another example, *dram3++ means access the 
Data RAM using the pointer dram3 and increment 
dram3. The symbol -- in place of the ++ would 
indicate autodecrement. 
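
The pointer-based access pattern can be sketched with a small Python model. The class and method names are hypothetical. The 9-bit pointer width follows the text, but wrapping a pointer around at the ends of the 512-word RAM is an assumption made only so the sketch stays in range.

```python
class DataRam:
    """Toy model of the Data RAM: 512 16-bit words and four pointers."""

    SIZE = 512

    def __init__(self):
        self.words = [0] * self.SIZE
        self.ptr = {1: 0, 2: 0, 3: 0, 4: 0}

    def load_pointer(self, n, address):
        # Pointers are 9 bits wide in this version of the part.
        self.ptr[n] = address % self.SIZE

    def read_pointer(self, n):
        # Upper seven bits read as zeros because only 9 bits are stored.
        return self.ptr[n]

    def read(self, n, step=0):
        """*dramN (step=0), *dramN++ (step=+1) or *dramN-- (step=-1)."""
        value = self.words[self.ptr[n]]
        self.ptr[n] = (self.ptr[n] + step) % self.SIZE  # assumed wrap-around
        return value

    def write(self, n, value, step=0):
        """Write through pointer n, then optionally post-increment/decrement."""
        self.words[self.ptr[n]] = value & 0xFFFF
        self.ptr[n] = (self.ptr[n] + step) % self.SIZE  # assumed wrap-around
```

Two writes through `*dram3++` followed by two reads through `*dram1--` return the values in reverse order.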



Loop Counters 

{cnt, cnt2} 

Two 16-bit loop counters are available to microcode 
programs for automatically counting iterations of a 
microcode loop. In parallel with other operations 
performed in an instruction, either loop counter can 
be decremented, and a conditional branch can be 
made based on the loop counter value being equal 
or not equal to zero. Since the two loop counters 
can be written and read on the A bus, as cnt and 
cnt2 respectively, they can also be used for variable 
storage when not being used as loop counters. The 
loop counters can be written to and decremented 
during the same instruction cycle. The value in the 
counter at the start of the next cycle will be the value 
written to the counter minus one. 

The LC microcode bit determines the loop counter 
that is selected for decrementing and/or branching 
in an instruction. The LC microcode bit does not 
affect the loop counter that is written or read over the 
A bus, since each loop counter is separately 
addressable as an A bus source or destination. Refer to 
Chapter 4 for a description of the CNT-- microcode 
bit that causes the selected loop counter to be 
decremented, and for a description of the CFSEL 
microcode field that is used to perform a conditional 
branch based on the selected loop counter's value. 
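
The write-and-decrement rule for a loop counter can be shown with a short Python model. The class name and the `cycle` method are hypothetical, and wrapping from zero to FFFF hex on a further decrement is an assumption, since the text does not describe that case.

```python
class LoopCounter:
    """Toy model of one 16-bit loop counter (cnt or cnt2)."""

    def __init__(self):
        self.value = 0

    def cycle(self, write=None, decrement=False):
        """Apply one instruction cycle's write and/or decrement."""
        if write is not None:
            self.value = write & 0xFFFF
        if decrement:
            # A write and a decrement in the same cycle leave (written - 1).
            self.value = (self.value - 1) & 0xFFFF  # wrap is an assumption
        return self.value
```

Writing 5 and decrementing in the same cycle leaves 4 in the counter at the start of the next cycle.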



Microcode RAM 

{mcode1-3, maddr, pc} 

The 82750PB executes instructions stored in an on- 
chip microcode RAM. This RAM holds 512 instruc- 
tions and each instruction is 48 bits wide. Normally, 
to start the microcode processor, the host CPU will 
load a microcode program into the microcode RAM, 
point the program counter, pc, to the start of the 
program, and then release the HALT bit to start exe- 
cuting the microcode program. The microcode proc- 
essor can also load its own microcode RAM to over- 
lay new routines and therefore, does not require 
constant intervention by the host to perform multiple 
operations. 

Writing an instruction into Microcode RAM is done 
by first loading the three registers mcode3, mcode2, 
and mcode1 with the three 16-bit words of the 
instruction (the most significant word goes into 
mcode1), and then loading the address where the 
instruction should be written into maddr. 
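
Splitting a 48-bit instruction into the three 16-bit words might look like the Python sketch below. The text states only that the most significant word goes into mcode1; placing the middle word in mcode2 and the least significant word in mcode3 is an assumption, and the helper names are hypothetical.

```python
def split_instruction(word48):
    """Split a 48-bit instruction into (mcode1, mcode2, mcode3)."""
    mcode1 = (word48 >> 32) & 0xFFFF  # most significant word (per the text)
    mcode2 = (word48 >> 16) & 0xFFFF  # assumed: middle word
    mcode3 = word48 & 0xFFFF          # assumed: least significant word
    return mcode1, mcode2, mcode3

def join_instruction(mcode1, mcode2, mcode3):
    """Reassemble the 48-bit instruction from its three 16-bit words."""
    return ((mcode1 & 0xFFFF) << 32) | ((mcode2 & 0xFFFF) << 16) | (mcode3 & 0xFFFF)
```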

The host CPU can also read the Microcode RAM by 
first loading the pc with the address of the 
instruction to be read and then reading the three 16-bit 
words of the instruction from the mcode1-mcode3 
registers. Normally, this would be done by the host 
CPU while the 82750PB is halted. Since 
mcode1-mcode3 hold the instruction pointed to by the pc (i.e. 
the instruction that is about to be executed), reading 
these three registers from a microcode 
routine is not normally useful. 

The read registers named mcode1-mcode3 and the 
write registers also named mcode1-mcode3 are in 
fact different registers. Writing values into 
mcode1-mcode3 and then reading the values of 
mcode1-mcode3 will not read back the same values just 
written. The read registers hold the instruction stored in 
the instruction latch (the instruction to be executed). 
The write registers hold an instruction that is about 
to be written into microcode RAM. 

After writing to maddr to load an instruction into 
microcode RAM, a one-cycle freeze occurs and during 
the freeze a write to the microcode RAM takes 
place. The instruction following the write to maddr 
can either jump to the address just loaded or start 
loading the mcode1-mcode3 registers with the next 
instruction to be written. 

Here are two examples that illustrate the fact that 
the 82750PB requires at least one instruction be- 
tween the write to maddr and the execution of the 
instruction that is loaded by the write to maddr. 




Example 1: 

        maddr = ADDR1     /* load instruction */ 
        jmp ADDR1         /* jump to it, this is the extra inst. required between */ 
                          /* writing to maddr and executing the loaded inst. */ 
ADDR1: 
        ???????????       /* here's where new instruction gets loaded */ 



Example 2: 

        maddr = INST 
        nop               /* extra instruction */ 
INST: 
        ???????????       /* instruction gets loaded here */ 


When a microcode routine writes to pc, one more instruction is executed before the jump to the new address 
takes effect. For example: 


        pc = ADDR1 
        r0 = r1  jmp ADDR2    /* this instruction gets executed but */ 
                              /* its jump to ADDR2 is ignored. */ 
ADDR1: 
        r3 = r0               /* after this instruction executes r3 = r0 = r1 */ 
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When the host CPU writes to the pc, the instruction 
at the address that was written is loaded into the 
mcode1-mcode3 registers and, when the micro- 
code processor is released from its Halt condition, 
this is the first instruction that will be executed. 

When the host CPU reads the pc, the result returned 
is the address of the instruction that will be executed 
when Halt is released, that is, the address of the 
instruction held in the mcode1-mcode3 registers. 



Horizontal Line Counter 

{lcnt} 

The 12-bit Horizontal Line Counter is updated by 
VBUS codes from the 82750DB to track the 
horizontal display line that is currently being scanned by the 
82750DB. The counter is reset by a VODD code and 
incremented each time an HLINE code is received. 
A value can also be written into the Horizontal Line 
Counter, but this is used primarily for testing the 
82750PB. The upper four bits will always read zeros. 



Field Counter 



{fcnt} 



The 4-bit field counter is updated by VBUS codes 
from the display processor to keep track of the field 
count being displayed by the 82750DB. The counter is 
incremented each time a VODD code or VEVEN code 
is received. When reading the field counter, the 
upper 12 bits will read zeros. This counter will not be 
initialized upon reset. 
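
The way the two counters respond to VBUS codes can be sketched in Python. The code names are passed as strings purely for illustration, since the numeric VBUS encodings are not given in this section; the class name is hypothetical, and wrapping at 12 and 4 bits is an assumption based on the stated counter widths.

```python
class DisplayCounters:
    """Toy model of the horizontal line counter and the field counter."""

    def __init__(self, field=0):
        self.line = 0
        # The field counter is not initialized on reset, so the caller
        # supplies whatever starting value it wants to model.
        self.field = field & 0xF

    def on_vbus(self, code):
        if code == "VODD":
            self.line = 0                           # line counter reset
            self.field = (self.field + 1) & 0xF     # field counter increments
        elif code == "VEVEN":
            self.field = (self.field + 1) & 0xF
        elif code == "HLINE":
            self.line = (self.line + 1) & 0xFFF
```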



Input FIFOs 

{inN-lo, inN-hi, inN-c, *inN; N = 1,2} 

There are two input channels, referred to as input 
FIFOs, through which the processor can read pixels 
or data from VRAM. Each channel automatically 
fetches 64-bit quad words from VRAM and breaks 
them into 8-bit bytes or 16-bit words that are read by 
microcode. Each input FIFO operates independently 
and can be programmed to automatically increment 
or decrement through bytes or words in VRAM. The 
FIFOs are double buffered so that while values are 
being extracted from one quad word (64 bits), the 
next quad word is being prefetched from VRAM. 



The mode control register for each input FIFO, des- 
ignated in1-c or in2-c, contains four mode bits as 
seen in Figure 2-2. The WORD/BYTE bit (bit 0) de- 
termines whether the input FIFO is in word mode 
(WORD/BYTE = 0) or byte mode (WORD/BYTE = 
1). In byte mode, the FIFO can start reading on any 
byte boundary and in word mode on any word 
boundary. 

The INC/DEC bit (bit 1) determines the order that 
bytes or words are read from VRAM. In INCRE- 
MENT mode, with INC/DEC = 0, the FIFO reads 
from the least significant byte or word to the most 
significant byte or word of each double word and 
increments through double words in VRAM. In DEC- 
REMENT mode, with INC/DEC = 1, the FIFO reads 
from most significant byte or word to least significant 
byte or word within a double word and decrements 
through double words in VRAM. 

The AHOLD bit (Bit 2) is used by the address hold 
mode. When asserted, (bit 2=1) the automatic ad- 
dress increment/decrement function will be disabled 
and input FIFOs will not double buffer VRAM data. In 
other words, at the end of a VRAM cycle, when the 
FIFO has been updated with 64 bits of VRAM data, 
the input FIFO will not issue another MREQ# until 
there is a write to the address-lo registers OR a roll- 
over/roll-under read access of the input FIFO. If a 
roll-over/roll-under occurs, then a memory request 
will be issued to fetch data from the same VRAM 
location. If there is a write to the address-lo register, 
the FIFO will then fetch data from the new location. 

The PREFETCH OFF bit (bit 3) specifies whether 
the FIFO will automatically prefetch successive quad 
words from VRAM or will only fetch a new quad word 
when a value from that quad word is requested. In 
PREFETCH-ON mode, bit 3 = 0, the input FIFO pre- 
fetches successive quad words from VRAM as nec- 
essary to keep its buffer full (either from ascending 
or descending addresses, depending on the state of 
the INC/DEC bit). In PREFETCH-OFF mode, the 
FIFO will still prefetch the first two quad words to fill 
its buffer (when started at a new address location), 
but will only fetch a new quad word when a read 
request is made to the FIFO for a value in the next 
unfetched quad word. 

Figure 2-2. Input FIFO Control Register 

The CB bit (bit 4) allows circular buffers of sizes 
64 Kbytes, 128 Kbytes, or 256 Kbytes to be created 
in VRAM. The buffer size is selected by programming 
the least significant 3 bits of the circular buffer 
register (circbuf). To enable this feature, the CB bit 
must be set to 1; then, depending on the buffer size 
selected, the appropriate address pin that goes off 
chip will be forced to 0 (register pointers remain 
unchanged). Table 2-3 shows the programming 
combinations of the circular buffer register. 

It is important to note that the internal address 
counters themselves are not affected by the circbuf 
function. Only the selected external address pin is 
forced to '0'. 



Table 2-3. Circular Buffer Register (circbuf) 

Bits [2:0]   Buffer Size   Effect on PB Address Bus (If Function Enabled) 
000          Disabled      None 
100          256 Kbytes    Address Pin 18 Forced to 0 
010          128 Kbytes    Address Pin 17 Forced to 0 
001          64 Kbytes     Address Pin 16 Forced to 0 
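
The effect of the circular buffer function on the external address can be sketched as below. This Python model assumes that "address pin N" corresponds to bit N of a byte address, which matches the buffer sizes in the table (bit 16 for 64 Kbytes, and so on); the function and table names are hypothetical.

```python
# circbuf bits [2:0] -> address pin forced to 0 (None means disabled)
CIRCBUF_PIN = {0b000: None, 0b100: 18, 0b010: 17, 0b001: 16}

def external_address(address, circbuf_bits, cb_enabled):
    """Return the address driven off chip; internal counters are unchanged."""
    code = circbuf_bits & 0b111
    if code not in CIRCBUF_PIN:
        raise ValueError("combination not listed in the circbuf table")
    pin = CIRCBUF_PIN[code]
    if not cb_enabled or pin is None:
        return address
    return address & ~(1 << pin)
```

With the 64-Kbyte setting, an internal address of 10004 hex appears externally as 00004 hex, so the FIFO wraps back to the start of the buffer.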



In "BY-32" MODE (bit 3), the pointer increments or 
decrements by 32 bits, independent of whether the 
FIFO is in 8-bit pixel mode or 16-bit pixel mode. This 
mode was added to facilitate microcode that oper- 
ates on one component of a 32-bit per pixel image. 

The standard sequence for initializing an input FIFO 
is to write to the control register (in-c), the high 
address (in-hi), and then the low address (in-lo) of the 
appropriate FIFO. Refer to the access state diagram 
in Chapter 3. The write to in-lo causes the FIFO to 
start reading from VRAM. A byte or word is then 
read from *in. Successive reads from *in will read 
sequential bytes or words from VRAM. Writing to the 
control register each time the FIFO is started at a 
new address is not necessary, except to change the 
FIFO's mode. Also, if the new address is within the 
same 64 Kbyte page of VRAM, only the lo-address 
needs to be written in order to start the FIFO reading 
from the new address. 

If microcode attempts to read a value from an empty 
input FIFO, the processor is frozen prior to the exe- 
cution of the instruction, until the FIFO's control log- 
ic has fetched another double word from VRAM and 
extracted the next value. At this point, the processor 
is released from the frozen state, and the instruction 
that reads the value is executed. When the proces- 
sor is frozen waiting for a particular FIFO that isn't 
yet ready, that FIFO's VRAM access priority is raised 
above all other FIFOs. 



Output FIFOs 

{outN-lo, outN-hi, outN-c, *outN, outN++; N = 1, 2} 

There are two output channels, referred to as output 
FIFOs, through which the graphics processor writes 
pixels or data to VRAM. Each channel automatically 
collects bytes or words into 64-bit quad words and 
writes the quad words to VRAM. Each output FIFO 
operates independently and can be programmed to 
write bytes or words into sequential addresses in 
VRAM (either incrementing or decrementing). The 
FIFOs are double buffered so that while one quad 
word is waiting to be written to VRAM, the next quad 
word can be assembled from individual bytes or 
words. 

The mode control register for each output FIFO, 
designated out1-c or out2-c, contains six mode bits 
as shown in Figure 2-3. The WORD/BYTE bit 
(bit 0) determines whether the output FIFO is in word 
mode (WORD/BYTE = 0) or byte mode 
(WORD/BYTE = 1). In byte mode the FIFO can start writing 
on any byte boundary in VRAM and in word mode on 
any word boundary. 

The INC/DEC bit (bit 1) determines the order that 
bytes or words are written to VRAM. In INCREMENT 
mode, with INC/DEC = 0, the FIFO writes from the 
least significant byte or word to the most significant 
byte or word in a double word and increments 
through double words in VRAM. In DECREMENT 
mode, with INC/DEC = 1, the FIFO writes from 
most significant byte or word to least significant byte 
or word within a double word and decrements 
through double words in VRAM. 

When the AHOLD bit (bit 2) is set, the output FIFO 
quad word address is not incremented or decre- 
mented. In this mode, the FIFO continues to output 
to a single quad word in VRAM. 

The FORCE-LSB bits (bits 3 and 4) are used to force 
the least significant bit of each byte written to VRAM 
to either a zero or a one. This can be used, for 
example, to force the LSB to the correct polarity when 
writing to the U bitmap during motion video 
decompression. In certain display modes for the 82750DB, 
the LSB of the 8-bit samples in the U or Y bitmap are 
used to select VIDEO or GRAPHICS display mode 
for the n x n group of display pixels corresponding to 
the particular U or Y sample. A one in the 
FORCE-LSB ENABLE bit (bit 4) enables the forcing; a zero 
results in normal operation. The FORCE-LSB VALUE 
bit (bit 3) is used as the value to which the LSB is 
forced. Whether in byte mode or word mode, the 
LSB of each byte is forced to the FORCE-LSB value. 

Bits    Function 
15-6    Set to Zeros 
5       BY-32 MODE 
4       FORCE-LSB ENABLE 
3       FORCE-LSB VALUE 
2       AHOLD 
1       INC/DEC 
0       WORD/BYTE 

Figure 2-3. Output FIFO Control Register 

In "BY-32" MODE (bit 5), the pointer increments or 
decrements by 32 bits, independent of whether the 
FIFO is in 8-bit pixel mode or 16-bit pixel mode. This 
mode is used to facilitate microcode that operates 
on one component of a 32-bit per pixel image. The 
bytes or words that are skipped over will be un- 
changed in VRAM. 
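
The output FIFO control register fields and the FORCE-LSB behavior can be sketched in Python. The bit positions follow the description above; the function names and dictionary keys are hypothetical, and the model covers only the register decoding and the per-byte LSB forcing of a 16-bit value.

```python
def decode_out_control(reg):
    """Decode the six mode bits of an output FIFO control register."""
    return {
        "byte_mode": bool(reg & 0x01),         # bit 0: WORD/BYTE (1 = byte)
        "decrement": bool(reg & 0x02),         # bit 1: INC/DEC (1 = decrement)
        "ahold": bool(reg & 0x04),             # bit 2: AHOLD
        "force_lsb_value": (reg >> 3) & 1,     # bit 3: FORCE-LSB VALUE
        "force_lsb_enable": bool(reg & 0x10),  # bit 4: FORCE-LSB ENABLE
        "by_32": bool(reg & 0x20),             # bit 5: BY-32 MODE
    }

def apply_force_lsb(word16, reg):
    """Force the LSB of each byte of a 16-bit value when forcing is enabled."""
    mode = decode_out_control(reg)
    word16 &= 0xFFFF
    if not mode["force_lsb_enable"]:
        return word16
    if mode["force_lsb_value"]:
        return word16 | 0x0101
    return word16 & 0xFEFE
```

With bits 4 and 3 both set, `apply_force_lsb(0x1200, 0b011000)` sets the LSB of both bytes.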

The standard sequence for initializing an output 
FIFO is to write to the control register (out-c), the 
low address (out-lo), and then the high address 
(out-hi) of the appropriate FIFO. A series of bytes or 
words is then written to *out. Refer to the access 
state diagram in Chapter 3 (Figure 3-1). 

In order to flush any remaining data in an output 
FIFO before changing its VRAM pointer, it is neces- 
sary to write to the control register. When pointing to 
a new location in VRAM, if the new address is within 
the same 64 kByte page of VRAM, only the lo-ad- 
dress needs to be written. 

There must be one instruction between the write to 
the output FIFO's low address and the first write to 
*outN. Therefore, it is recommended that outN-lo be 
written before outN-hi. The write to outN-hi ensures 
that this requirement is met. If only the outN-lo value 
is being changed, it is still necessary to have one 
additional instruction before the first write to *outN. 

When writing bytes or words to VRAM through an 
output FIFO, a byte or word can be skipped over by 
writing to outN++ instead of *outN. When the 
values are written to VRAM, any byte or word that was 
skipped will retain its original value in VRAM, and its 
value is not altered by the VRAM write. This can be 
used when writing a series of pixels, some of which 
are "transparent", allowing whatever was behind 
them to show through. 
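
The effect of skipped bytes on a quad-word write can be illustrated with the Python sketch below. It models VRAM as a byte array and represents a skipped byte as `None`; both choices, and the function name, are assumptions made for illustration and say nothing about how the FIFO tracks skipped entries internally.

```python
def flush_quad_word(vram, base, entries):
    """Write one 8-byte quad word; None entries leave VRAM untouched."""
    if len(entries) != 8:
        raise ValueError("a quad word holds exactly 8 bytes")
    for offset, value in enumerate(entries):
        if value is not None:
            vram[base + offset] = value & 0xFF
```

Bytes marked as skipped keep whatever value was already stored, which is what lets "transparent" pixels show the existing background.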

If the microcode routine attempts to write a value to 
a full output FIFO, the processor is frozen prior to 
the execution of the instruction. The processor re- 
mains frozen until the FIFO has a chance to write 
one of the buffered quad words to VRAM. At that 
point, the processor is released from the frozen 
state, and the instruction that writes the value is exe- 
cuted. When the processor is frozen, waiting for a 
particular FIFO that isn't yet ready, that FIFO's 
VRAM access priority is raised above all other 
FIFOs. 



Statistical Decoder 

{stat-lo, stat-hi, stat-c, stat-ram, *stat, *stat#} 

The Statistical Decoder (also referred to as the Huff- 
man Decoder) is a specialized input channel that 
can read a variable-length bit sequence from VRAM 
and convert it into a fixed-length bit sequence that is 
read by the microcode processor. In image com- 
pression, as well as in other applications such as 
text compression, certain values occur more fre- 
quently than others. A means of compressing this 
data is to use fewer bits to encode more frequently 
occurring values and more bits to encode less fre- 
quently occurring values. This type of encoding re- 
sults in a variable-length sequence in which the 
length of a symbol (the group of bits used to encode 
a single value) can range for example, from one bit 
to sixteen bits. 

The statistical code that the statistical decoder can 
decode takes either of two forms: 

0x                          1x 
10x                         01x 
110xxx                      001xxx 
1110xxxxx                   0001xxxxx 
...                  or     ... 
11111110xxxxxx              00000001xxxxxx 
111111110xxxxxx             000000001xxxxxx 
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Each symbol of a given length (one per line as 
shown here) consists of a run-in sequence followed 
by some number of x-bits. The run-in sequence is 
defined as a series of zero or more ONEs followed 
by a ZERO or, as in the code on the right above, 
zero or more ZEROs followed by a ONE. The re- 
mainder of this description will use examples of the 
code on the left. A bit in the decoder's control regis- 
ter determines the polarity of the run-in sequence 
bits. 

In the example on the left, there would be two 
symbols of length two: 00 and 01. Each x-bit can take on 
a ZERO or ONE value. The number of x-bits 
following a run-in sequence can range from zero to six. 
Since the goal, in general, is to have a few short 
codes and a larger number of long codes, typically, 
codes with fewer run-in bits will have fewer x-bits 
following. However, this is not a hardware constraint. A 
code of this form is completely described by a code 
description table indicating, for each length of run-in 
sequence, R = the number of ONEs in the run-in, 
how many x-bits follow the ZERO. The value of 
R is used as an index into the code description table. 
Due to the hardware implementation, the number 
actually stored in the table is 2^X, where X is the 
number of x-bits. 

For the example above, the corresponding code de- 
scription values are given in Table 2-4. 
Table 2-4. Sample Code Description Table 

    R      X      2^X (dec.)    2^X (bin.) 
    0      1      2             000 0010 
    1      1      2             000 0010 
    2      3      8             000 1000 
    3      5      32            010 0000 
    . . . 
    7      6      64            100 0000 



Note that the table only goes up to symbols with 
seven ONEs in the run-in. For symbols with more 
than seven ONEs, the value of X and 2^X for seven 
ONEs is used for all symbols having seven or more 
ONEs in the run-in sequence. For example, in the 
code above a symbol with eight or more ONEs in the 
run-in sequence has six x-bits following the ZERO, 
which is the same as symbols having seven ONEs. 

For each different symbol, including all symbols of 
the same run-in length with different x-bit values, the 
decoder generates a unique fixed-length, 16-bit val- 
ue. Some of the decoded values for the sample 
code given above are provided in Table 2-5. 

Table 2-5. Decoded Values 

    Symbol*         Decoded Value 
    00              0 
    01              1 
    100             2 
    101             3 
    110000          4 
    110001          5 
    110010          6 
    . . . 
    110111          11 
    111000000       12 
    . . . 
    111011111       43 

*The x-bits of the symbol are in boldface for clarity. 

The algorithm for generating a decoded value from a 
symbol is as follows: all symbols of a given run-in 
length are assigned a base value, B; the value corre- 
sponding to a particular symbol is equal to B plus the 
binary value of the x-bits in the symbol. The base 
value B for a symbol with a run-in length of R is 
calculated by: 

    B(R) = SUM[2^X(r)] for r = 0 to R - 1, 

where X(r) corresponds to the X value in the table 
entry corresponding to R = r. 

For example, in the above code: 

    B(0) = 0 (B(0) is always zero) 
    B(1) = 0 + 2 = 2 
    B(2) = 0 + 2 + 2 = 4 
    B(3) = 0 + 2 + 2 + 8 = 12 
    B(4) = 0 + 2 + 2 + 8 + 32 = 44 

This is one of the reasons that the table holds 2^X 
instead of X: the calculation of B(R) is easier to 
implement in logic. 
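The base-value rule can be checked with a short sketch. The Python below is illustrative only and is not a description of the 82750PB logic; the names `base_values` and `decoded_value` are invented for this example, and the table holds 2^X only for R = 0 through 3 of the sample code, because Table 2-4 does not list the entries for R = 4 through 6. 

```python
def base_values(two_pow_x):
    """Return B(0)..B(len(two_pow_x)), where B(R) sums 2^X(r) for r < R."""
    bases = [0]                      # B(0) is always zero
    for entry in two_pow_x:
        bases.append(bases[-1] + entry)
    return bases


def decoded_value(symbol, two_pow_x):
    """Decode one complete symbol given as a string of '0'/'1' characters."""
    r = symbol.index("0")            # number of ONEs before the ZERO
    x_bits = symbol[r + 1:]          # bits that follow the run-in
    assert 2 ** len(x_bits) == two_pow_x[r], "wrong number of x-bits"
    offset = int(x_bits, 2) if x_bits else 0
    return base_values(two_pow_x)[r] + offset


SAMPLE = [2, 2, 8, 32]               # 2^X for R = 0, 1, 2, 3 (Table 2-4)
```

With this table, `base_values(SAMPLE)` reproduces the B(0) through B(4) values worked out above, and `decoded_value` reproduces the rows of Table 2-5. 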

Two enhancements are made to this coding scheme in the 
implementation on the 82750PB. These two modes are 
referred to as END mode and SHORT mode. If neither END 
nor SHORT mode is enabled, the decoding is performed as 
described above. SHORT mode allows the decoder to be 
switched easily to a simpler code format without having 
to reload the code description table. In the SHORT form, 
all symbols have the same number of x-bits, as though all 
entries in the table had been filled with the same value 
of 2^X. When SHORT mode is invoked, this value of 2^X is 
obtained from a field in the statistical decoder's 
CONTROL word, instead of from the individual table 
entries. 

END mode is added in recognition of the fact that, 
for codes with few symbols, some increase in effi- 
ciency is possible by not having to place a zero at 
the end of the longest run-in sequence. For example, 
consider the code: 

    0 
    10x 
    110x 

The END mode allows us to shorten the last symbol 
to 11x instead of 110x. The trailing ZERO is not 
required because the decoder has been told that the 
maximum length of a run-in is two ONEs. The resulting 
symbol set and corresponding decoded values are given 
in Table 2-6. 

Table 2-6. END Mode Decoded Values 

    Symbol      Decoded Value 
    0           0 
    100         1 
    101         2 
    110         3 
    111         4 

The number of x-bits must be constant for all symbols 
of the same run-in length. Therefore, a code such as: 

    10xx 
    11xxx       NOT CORRECT! . . . Must be 11xx. 

is not allowed. The last symbol (11xxx, in this case) 
uses the same table entry for 2^X as the next-to-last 
symbol (10xx) and, therefore, the last symbol must be 
11xx. 

The maximum length of the run-in sequence in END 
mode is specified by placing an END flag in the code 
description table. For example, a code and the 
corresponding table are shown in Table 2-7. 

Table 2-7. END Flag Decoded Values 

                    Table Entries 
    Code        Index    END Bit    2^X 
                0        0 
    10xx        1        0          4 
    110xxx      2        1          8 
    111xxx      3        -          - 
                4        -          - 
                5        -          - 
                6        -          - 
                7        -          - 

The hyphens indicate that those table entries are not 
used to decode this code. Note that the symbol 111xxx 
has three x-bits because of the value of 2^X in Index 2; 
it is not based on the 2^X value in Index 3. 

The SHORT and END modes can be invoked 
simultaneously, resulting in a code such as: 

    0x 
    10x 
    110x 
    111x 

with a SHORT 2^X value = 2 (for one x-bit in each 
symbol) and the END bit set in Index 2. 
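The interaction of the END flag and the SHORT value can be made concrete with a small model. This is a sketch under the reading of Tables 2-6 and 2-7 given above (the run-in stops once it is one ONE longer than the index carrying the END flag, and the x-bit count then comes from that index); `decode_one` and its parameters are hypothetical names, and the bitstream is represented as a Python string rather than VRAM data. 

```python
def decode_one(bits, two_pow_x, end_index=None, short=None):
    """Decode the first symbol in `bits`, a string of '0'/'1' characters.

    two_pow_x -- code description table entries (2^X per index)
    end_index -- index whose END flag is set, or None if END mode is off
    short     -- SHORT 2^X value, or None if SHORT mode is off
    Returns (decoded value, number of bits consumed).
    """
    def entry(i):
        # In SHORT mode every index behaves as if it held the same 2^X.
        return short if short is not None else two_pow_x[i]

    r = 0
    while bits[r] == "1":
        r += 1
        if end_index is not None and r == end_index + 1:
            break                    # longest run-in: no trailing ZERO
    ended = end_index is not None and r == end_index + 1
    pos = r if ended else r + 1      # skip the ZERO unless END cut it off
    index = end_index if ended else r
    n_x = entry(index).bit_length() - 1
    offset = int(bits[pos:pos + n_x], 2) if n_x else 0
    base = sum(entry(i) for i in range(r))
    return base + offset, pos + n_x
```

For the code of Table 2-6 the table is `[1, 2]` with `end_index=1`; for the combined SHORT and END example above, the call is `decode_one(bits, [], end_index=2, short=2)`, which maps the eight symbols onto the values 0 through 7. 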

Packed binary fields with one to seven bits per field 
can be read using the statistical decoder by setting 
the END bit in Index 0 and by programming the X 
value to be N-1, where N is the number of bits per 
field. For example, packed three-bit fields could be 
decoded as shown in Table 2-8. 



Table 2-8. Packed 3-Bit Field Decoded Values 

                    Table Entries 
    Code        Index    END Bit    2^X 
    0xx         0        1          4 (N = 3, so X = 2) 
    1xx         1        -          - 
                2        -          - 
                3        -          - 
                4        -          - 
                5        -          - 
                6        -          - 
                7        -          - 



The unpacked bits are in reverse order relative to 
how they are stored in VRAM. For example, if three- 
bit values are packed in VRAM, the pattern 110 in 
VRAM is read from right to left and gives an un- 
packed or decoded value of 3. 
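The bit reversal follows from the table setup: the first bit read selects a base of 0 or 2^X, and the remaining N-1 bits are x-bits with the first one read as the most significant. A minimal sketch of the resulting mapping, with `unpack_field` as an invented helper name: 

```python
def unpack_field(field, n_bits):
    """Return the value the decoder would report for one packed field.

    `field` is the n-bit pattern as stored in VRAM. The decoder consumes
    it LSB first, and the first bit read ends up most significant.
    """
    value = 0
    for _ in range(n_bits):
        value = (value << 1) | (field & 1)   # next bit read becomes the new LSB
        field >>= 1
    return value
```

Here `unpack_field(0b110, 3)` gives 3, matching the example in the text. 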

The CONTROL register for the statistical decoder 
(stat-c) is used to specify the mode to use for decod- 
ing, as well as to invoke certain modes for writing 
and reading the code description table. Refer to the 
bit assignments for this register below. To write to 
the code description table, the WRITE bit (bit 4) is 
set to a ONE; the starting table index is reset to 
zero. Each write to the table causes the index to 
increment by one. This index will wrap around from 
seven back to zero. For example, to write all eight 
table entries the user would write a value of 0x10 to 
the stat-c register and then write eight 8-bit values 
to the register stat-ram. The most significant bit of 
each 8-bit value is the END bit, and the lower seven 
bits are the value of 2^X. To read the code description 
table, the TEST bit (bit 5) of the CONTROL register 
is set to a ONE. The table entries are then read from 
the decoder's data register (*stat). Reads and writes 
always start at table entry zero. 
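The layout of the eight values written to stat-ram can be sketched as follows. The helper name `table_bytes` is hypothetical, and writing zero into the unused entries is an assumption made for the example; the text does not say what unused entries should contain. 

```python
WRITE_BIT = 1 << 4          # CONTROL register bit 4 (0x10)
TEST_BIT = 1 << 5           # CONTROL register bit 5 (0x20)


def table_bytes(entries):
    """Pack (end_flag, two_pow_x) pairs into the eight 8-bit table values."""
    if len(entries) != 8:
        raise ValueError("the code description table has eight entries")
    packed = []
    for end_flag, two_pow_x in entries:
        if not 0 <= two_pow_x <= 0x7F:
            raise ValueError("2^X must fit in the lower seven bits")
        packed.append((0x80 if end_flag else 0) | two_pow_x)
    return packed


# Packed 3-bit fields (Table 2-8): END flag and 2^X = 4 in index 0.
# The remaining seven entries are unused and written as zero here.
PACKED_3BIT = table_bytes([(True, 4)] + [(False, 0)] * 7)
```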

NOTE: 



When reading the code description table, it is 
necessary to wait one instruction time between 
the write to stat-c and the first read from *stat. 
An access diagram showing all legal sequences 
for read and write FIFO registers is shown in 
Chapter 3 (Figure 3-1). 

The code for reading the eight table entries into the first eight locations of data RAM would be: 

        dram3 = stat-c = 0x20     /test mode to read the stat-ram (the table) 
        cnt = 8                   /wait one inst. before first read 
    LOOP: 
        *dram3++ = *stat   cnt 
        jcp loop                  /two inst. loop necessary to wait one inst. 
                                  /between each read from *stat. 



    Bits    Field 
    15      POL 
    14      RSVD* 
    13      CB 
    12:8    SVAL 
    7       SHORT 
    6       END 
    5       TEST 
    4       WRITE 
    3       RSVD* 
    2:0     Starting Stat-ram ADDRESS 

    * Reserved: write zeros to these bits. 

Figure 2-4. Statistical Decode CONTROL Register 



END mode is enabled by setting the END bit (bit 6) 
in the CONTROL register to a ONE. The SHORT 
mode is enabled by setting the SHORT bit (bit 7) in 
the CONTROL register to a ONE. When in SHORT 
mode, the five SVAL bits (bits 12:8) in the CONTROL 
register are used as the SHORT 2^X value. 

The POL bit (bit 15) determines the polarity of the 
run-in sequence bits. If bit 15 = 0, the ONEs ending 
in ZERO (e.g., 1110xxx) sequence is selected. If 
bit 15 = 1, the ZEROs ending in ONE (e.g., 0001xxx) 
sequence is selected. 
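The field positions of Figure 2-4 can be collected into one helper. This is only a sketch for composing values offline; `stat_control` and its keyword names are invented for the example and are not 82750PB assembler syntax. 

```python
def stat_control(pol=0, cb=0, sval=0, short=0, end=0, test=0, write=0, addr=0):
    """Assemble a 16-bit CONTROL value from the Figure 2-4 fields."""
    if not 0 <= sval <= 0x1F:
        raise ValueError("SVAL is a five-bit field")
    if not 0 <= addr <= 0x7:
        raise ValueError("the starting stat-ram address is a three-bit field")
    return ((pol & 1) << 15 | (cb & 1) << 13 | sval << 8 |
            (short & 1) << 7 | (end & 1) << 6 | (test & 1) << 5 |
            (write & 1) << 4 | addr)        # reserved bits 14 and 3 stay zero
```

For example, `stat_control(write=1)` yields 0x10 and `stat_control(test=1)` yields 0x20, the two values used in the table write and table read procedures; `stat_control(short=1, end=1, sval=2)` would select the combined SHORT and END code with a SHORT 2^X value of 2. 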

The CB bit (bit 13) allows circular buffers of 
64 Kbytes, 128 Kbytes, or 256 Kbytes to be created 
in memory, as in the case of the input FIFO. The 
buffer size is selected by programming the least 
significant 3 bits of the circular buffer register 
(circbuf). To enable this feature, the CB bit must be 
set to a 1; then, depending on the buffer size selected, 
the appropriate address pin that goes off chip will be 
forced to a 0 (register pointers remain unchanged). 
Table 2-3 shows the programming combinations of the 
circular buffer register. 

The decoding parameters may be changed between 
symbols by writing to the CONTROL register and, if 
necessary, writing new values into the code descrip- 
tion table. The correct procedure for changing the 
code type or decode mode is to read the last value 
from the decoder prior to the change, using *stat* 
instead of *stat. This keeps the decoder from auto- 
matically starting to decode the next symbol. At this 
point, the code description table and the SHORT 
and END mode bits can be changed as desired. The 
next time the CONTROL register is written with both 
TEST = 0 and WRITE = 0, the decoder will begin 
to decode the next symbol using the new parameters. 



The statistical decoder buffers one quad word read 
from VRAM so that the decoding of bits in one 32-bit 
word and the fetch of the next 32-bit word may overlap. 
As with the input and output FIFOs, the decoder 
has a VRAM pointer associated with it that points to 
the location in VRAM from which it is reading data. 
This pointer increments twice each time a new quad 
word is read; there is no decrement mode. When the 
least significant word of the decoder's pointer 
(stat-lo) is written, any data that had previously been 
prefetched from VRAM is ignored, and the decoder 
fetches one quad word starting from this new location. 

The 82750PB assumes that the statistically encoded 
bitstream in VRAM starts with the least significant bit 
of a double word. That is, the two LSBs of the address 
written to stat-lo are ignored. 

The statistical decoder decodes data at a rate of 
one bit per T-cycle. To a first approximation, the 
decode time for an N-bit symbol is: 

    decode time (in T-cycles) = N + 1 

Since it takes at least 64 T-cycles to decode data 
from one quad word, which is the time required for 
eight quad word reads from VRAM, the decoder 
should rarely run out of data. Therefore, the above 
estimate should model the actual decoding rate of the 
statistical decoder very accurately. 

The statistical decoder always begins to read the 
bitstream from the least significant bit of the double 
word found at the starting location in VRAM. That is, 
the decoder does not start on a byte or word boundary 
as an input FIFO or output FIFO does, but only 
on double word boundaries. The bitstream moves 
from the least significant bit to the most significant 
bit of a double word and then to the least significant 
bit of the next double word (at the next higher address 
location). For the x-bits, the first x-bit read 
from the bitstream becomes the most significant bit 
of the x-bit field when it is interpreted as a binary 
number. The example below shows a code definition, 
a bitstream stored in VRAM, and the resulting 
decoded values. 

The code definition and range of values for each 
symbol length are indicated in Table 2-9. 

Table 2-9. VRAM Bitstream Decode Values 

    Symbol      Values    Comments 
    0           0 
    10x         1, 2      100 = 1, 101 = 2 
    110xx       3-6       11000 = 3, ..., 11011 = 6 
    1110xxx     7-14      1110000 = 7, ..., 1110111 = 14 



Decoding starts at address 0 in this example. The 
two double words at addresses 0 and 1 are: 

    0: 0xAC98E14D 
    1: 0x372E74CB 

The bitstream in VRAM, with colons dividing the 
symbols (read from right to left starting at the LSB 
of address 0), is shown in Figure 2-5. 

Table 2-10 lists the symbols, in the order they are 
encountered in the bitstream, and the corresponding 
decoded values. 



Table 2-10. Decoding Symbols 

    Symbol      Value    Comments 
    101         2        Starts at LSB, Address 0, Scanning Left 
    100         1 
    101         2 
    0           0 
    0           0 
    0           0 
    0           0 
    1110001     8 
    100         1 
    100         1 
    11010       5 
    1110100     11       Spans First and Second Double Word 
    11001       4 
    0           0 
    1110011     10 
    101         2 
    0           0 
    0           0 
    1110110     13 
    0           0 











    Address   MSB   <--- Read bitstream from LSB to MSB <---   LSB 

       0      1:01011:001:001:1000111:0:0:0:0:101:001:101   <- Start Here 
              ^ 
              First bit of a symbol, continued at LSB of next double word 

       1      0:0110111:0:0:101:1100111:0:10011:001011 

Figure 2-5. VRAM Bitstream Decoding Addresses 
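The worked example can be reproduced in software. The sketch below is a plain model of the bit ordering rules described above (LSB first within each double word, first x-bit most significant) for a ONEs-ending-in-ZERO code without END or SHORT mode; `bit_stream` and `decode_all` are invented names, and the model makes no attempt to reflect timing or prefetch behaviour. 

```python
def bit_stream(words):
    """Yield bits LSB first from each 32-bit double word, lowest address first."""
    for word in words:
        for i in range(32):
            yield (word >> i) & 1


def decode_all(words, two_pow_x):
    """Decode every symbol in `words` using a ONEs-ending-in-ZERO code."""
    bits = bit_stream(words)
    values = []
    for bit in bits:                 # first bit of the next symbol
        r = 0
        while bit == 1:              # count the ONEs of the run-in
            r += 1
            bit = next(bits)
        n_x = two_pow_x[r].bit_length() - 1
        offset = 0
        for _ in range(n_x):         # first x-bit read is the MSB
            offset = (offset << 1) | next(bits)
        values.append(sum(two_pow_x[:r]) + offset)
    return values


CODE = [1, 2, 4, 8]                  # 2^X for 0, 10x, 110xx, 1110xxx
print(decode_all([0xAC98E14D, 0x372E74CB], CODE))
```

The printed list is the Value column of Table 2-10 in order: 2, 1, 2, 0, 0, 0, 0, 8, 1, 1, 5, 11, 4, 0, 10, 2, 0, 0, 13, 0. 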

Figure 2-6. Pixel Interpolation 

Pixel Interpolator 

[ Pixint-c, Pixint] 

The pixel interpolator performs bilinear interpolation 
on four 8-bit pixels to generate, in effect, a pixel 
shifted by a fraction of a pixel position. See Figure 
2-6. If the four pixels have values of A, B, C, and D; 
and the horizontal weight and vertical weight are h 
and v, respectively, the interpolated value W, ignor- 
ing any quantization effects, is given by: 

W = A*(1-h)(1-v) + B*h(1-v) + C*(1-h)v + D*hv 

The values of h and v are even multiples of 1/16. 
Figure 2-6 illustrates pixel interpolation with an h 
weight of 6/16 or 3/8 and a v weight of 10/16 or 
5/8. 
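The weighting formula can be evaluated directly with the weights expressed as numerators over 16. This is a numerical sketch only: `interpolate` is an invented name, and the rounding or truncation performed by the hardware is not modelled, so the result is returned as an unrounded float. 

```python
def interpolate(a, b, c, d, h16, v16):
    """Bilinear blend of pixels A, B (first row) and C, D (second row).

    h16 and v16 are the weight numerators, 0..15, over a denominator of 16.
    The result is the exact weighted sum divided by 256; hardware
    quantization is not modelled.
    """
    total = (a * (16 - h16) * (16 - v16) + b * h16 * (16 - v16) +
             c * (16 - h16) * v16 + d * h16 * v16)
    return total / 256
```

With h16 = 6 and v16 = 10 (the weights of Figure 2-6), the four coefficients are 60/256, 36/256, 100/256 and 60/256, which sum to one. 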

The pixel interpolator can operate in two modes: 
sequential-2D and random-2D. Sequential-2D mode is 
used for motion video decoding and when an array 
of pixels is interpolated with a common weighting. 
Random-2D mode is used either when the pixel arrays 
to be interpolated are not adjacent pixels in two 
rows or when the weight is changed for each 
interpolation. (The word random is used here to mean 
non-sequential.) 



The example in Figure 2-7 shows a single row of 
pixels being interpolated in Sequential-2D mode us- 
ing two rows from the original (source) bitmap. The h 
and v weighting are constant for all the interpolated 
pixels. In this case, the weights appear to be approx- 
imately h = 10/16 and v = 6/16. 



    A   B   E   F   I ...       First Input Row 

      W   X   Y   Z ...         Interpolated Row 

    C   D   G   H   K ...       Second Input Row 



Figure 2-7. Sequential-2D Pixel Interpolation 

The pixel interpolator is pipelined and requires some 
startup sequence to fill the pipeline. Once filled, the 
pixel interpolator generates a new interpolated pixel 
every two T-cycles when in Sequential-2D mode. 
Source pixels are written into the interpolator as pix- 
el pairs. In the case above, the pixel pair BA would 
be written first, followed by the pixel pair DC. It would 
seem more natural to refer to the pixel pair as AB, 
but because of the way 8-bit pixels are arranged in 
16-bit words in VRAM, the left-most pixel on the 
screen is the least significant byte position. For 
example, if pixel A had a hex value of 0xAA and B had 
a value of 0xBB, the 16-bit word containing pixels A 
and B would have a value of 0xBBAA. 
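The byte ordering of a pixel pair can be written as a one-line helper. `pack_pair` is an invented name used only to illustrate the layout: 

```python
def pack_pair(left, right):
    """Combine two 8-bit pixels into one 16-bit word, left pixel in the low byte."""
    return ((right & 0xFF) << 8) | (left & 0xFF)
```

With pixel A = 0xAA on the left and pixel B = 0xBB on the right, `pack_pair(0xAA, 0xBB)` gives 0xBBAA, the pair referred to as BA in the text. 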

Then, two pixels are read from the interpolator. Be- 
cause the pipeline isn't full yet, these pixels are read 
and discarded. This loop of writing two pixel pairs 
and reading two output pixels continues four times. 
The two pixels that are read this fourth time are the 
first two valid output pixels: W and X. The interpola- 
tor may also collect output (interpolated) pixels into 
pixel pairs. For example, pixels W and X, instead of 
being output separately, would be combined into a 
16-bit pixel pair XW. Since there are two possible 
phase relationships between the input pixel pairs 
and output pixel pairs, the desired phasing (either X 
and W paired or Y and X paired) can be specified. 

    Bits    Function 
    15      RESERVED: Write as ZERO 
    14      Pipelining Select (1 = Fast, 0 = Standard) 
    13      Phase (0 = In Phase, 1 = Opposite Phase) 
    12      RESERVED: Write as ZERO 
    11      Pairing (1 = Output Pixel Pairs, 0 = Single Pixels) 
    10      Reset Bit (1 = Reset, 0 = Normal) 
    9:8     Mode Select Bits 
    7:4     Vertical Weight 
    3:0     Horizontal Weight 

Figure 2-8. Pixel Interpolator Control Register 



Random-2D interpolation is used either when the 
pixels to be interpolated are not in horizontal rows or 
when the weight is changed for each interpolated 
pixel. Examples of this are smooth warping or 
smooth scaling operations. In Random-2D mode, the 
processing for successive interpolated pixels cannot 
take advantage of pipelining; each pixel is considered 
to be the first pixel of a Sequential mode 
interpolation. The weight and the two input 
pixel-pairs are written into the interpolator. After 
waiting at least 10 T-cycles, the one interpolated 
pixel can be read. (The delay is 10 T-cycles in the 
standard mode (bit 14 = 0) and 6 T-cycles in the 
fast mode (bit 14 = 1).) Then the next two input 
pixel-pairs and, if necessary, the new weight value 
are written, and 10 cycles later the next interpolated 
pixel can be read. 

The h and v weight values, the mode selection, and 
other control bits are written to the pixel interpolator 
control register (avg-c). The bit assignment for this 
register is in Figure 2-8. The least significant byte 
holds the 4-bit v value (bits 7:4) and the 4-bit h value 
(bits 3:0). 

NOTE: 



The values used for h and v here are numerators 
of the fraction where the implied denominator is 
16. 



MODE SELECT 

Bits 8 and 9 are used to select one of four operating 
modes, of which only two are presently defined. 
These modes are given in Table 2-11. 

Table 2-11. Mode Select Operating Modes 

    Bits 9:8    Mode 
    00          Random-2D 
    01          Sequential-2D 
    10          RESERVED 
    11          RESERVED 



RESET 

Writing a ONE to bit 10 resets the pixel interpolator. 
The pixel interpolator must be reset prior to chang- 
ing modes. 



PAIRING 

A ZERO in bit 11 causes the pixel interpolator to 
output individual pixels. A ONE causes the interpola- 
tor to collect adjacent pixels (in Sequential-2D 
mode) into 16-bit pixel pairs. This feature assists in 
motion video decoding, when combined with the 
ALU's dual-add-with-saturate operation, by allowing 
two pixels to be processed each cycle. The phasing 
used in collecting the pixel pairs is determined by the 
Phase bit described below. 



PHASE 

When output pixels are collected into pixel pairs, 
there are two possible alignments of the input pixel 
pairs to the output pixel pairs. The Phase bit (bit 13) 
selects the alignment to be used, based on the rela- 
tive word alignment of the source and destination 
bitmaps in VRAM. When the Phase bit is set to a 
ZERO, this indicates that the bitmaps are in-phase. 
In this case, the first two output pixels are grouped 
into one 16-bit pixel pair (with the first pixel in the 
least significant byte). When the Phase bit is set to a 
ONE, the bitmaps are out-of-phase. In this case, the 
first pixel is placed in the most significant byte of the 
first pixel pair, with invalid data in the least significant 
byte, and the second and third output pixels are col- 
lected into the second pixel pair. This is illustrated in 
Figure 2-9. 



PIPELINING 

A ZERO in bit 14 causes the pixel interpolator to use 
the standard amount of pipeline delay. A ONE in this 
field will select the fast mode that has less pipeline 
delay. Table 2-12 shows the pipelining delay for both 
modes. Note that the effect of the phase bit is to add 
an extra pixel delay. 

    In-Phase: 
        A  B      E  F      I  J      1st Row of Input Pixel Pairs 
        W  X      Y  Z                Output Pixel Pairs 
        C  D      G  H      K  L      2nd Row of Input Pixel Pairs 

    Out-of-Phase: 
        A  B      E  F      I  J      1st Row of Input Pixel Pairs 
        ?? W      X  Y      Z  ??     Output Pixel Pairs 
        C  D      G  H      K  L      2nd Row of Input Pixel Pairs 

Figure 2-9. Pixel Pair Phases 



Table 2-12. Pipelining Delay for Sequential-2D NON-PAIR Mode 

    Pipelining Bit    Phase Bit    Pipeline Delay 
    (Bit 14)          (Bit 13)     in Output Pixels 
    0                 0            6 
    0                 1            7 
    1                 0            2 
    1                 1            3 



When in PAIR mode (with bit 11 = one), the amount 
of pixel delay does not change, but half as many 
reads and writes are required to fill the pipeline be- 
cause each read or write of the averager transfers 
two pixels. For example, when in the standard mode 
(bit 14 = 0), with zero phase (bit 13 = 0) and pair 
mode (bit 11 = 1), three indeterminate pixel pairs 
must be read before the first good pixel pair is read. 
In the same case but with the phase bit = 1, the 
fourth pixel pair read contains one good pixel and 
one indeterminate pixel, and the fifth pixel pair read 
contains two good pixels. 

RESERVED 

Bits 15 and 12 are reserved for future use. Write 
ZEROs into these bit positions. 
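The control register fields described in this section can be assembled as shown below. This is a sketch for illustration; `pixint_control` and the constant names are invented, and only the bit positions come from Figure 2-8 and Table 2-11. 

```python
RANDOM_2D, SEQUENTIAL_2D = 0b00, 0b01      # Table 2-11 mode codes


def pixint_control(h, v, mode, reset=0, pairing=0, phase=0, fast=0):
    """Assemble the 16-bit pixel interpolator control value.

    h and v are the weight numerators (0..15) over a denominator of 16.
    Reserved bits 15 and 12 are always written as zero.
    """
    if not (0 <= h <= 15 and 0 <= v <= 15):
        raise ValueError("h and v are four-bit numerators")
    if mode not in (RANDOM_2D, SEQUENTIAL_2D):
        raise ValueError("only modes 00 and 01 are defined")
    return ((fast & 1) << 14 | (phase & 1) << 13 | (pairing & 1) << 11 |
            (reset & 1) << 10 | mode << 8 | v << 4 | h)
```

For the weights of Figure 2-6 in Sequential-2D mode, `pixint_control(6, 10, SEQUENTIAL_2D)` gives 0x01A6. 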



Signature Register 

( hwid) 

The signature register can be read either by the host 
CPU or by microcode to determine the version of the 
82750PB. The value of the signature register can be 
used to distinguish between the 82750PB in the 



82750PA emulation mode, and the 82750PB in native 
mode. The currently defined signature values are 
given in Table 2-13. 

Table 2-13. Signature Values 

    Value       Definition 
    0xFFFE      The 82750PB Emulating the 82750PA 
    0xFFFC      The 82750PB in Native Mode 



All other signature values are presently undefined 
but may be used in the future to denote other ver- 
sions of the 82750 architecture. 



Display Format Registers 

[yeven, yodd, vu, vptr] 

The 82750PB's processor can write to the display 
registers in the VRAM interface. These registers are 
pointers and pitch values that address display bitmaps 
and 82750DB register loads in VRAM. Pointers are 
32-bit values that specify the starting byte address 
of a bitmap or register load within a 4 GByte address 
space. The bottom two address bits are ignored since 
display bitmaps and register loads must start on a 
double word boundary. Therefore, the internal 
representation of a pointer is a 30-bit value. The 
pitch value associated with each pointer indicates the 
number of bytes between the start of two lines of a 
display bitmap or between the start of two register 
loads. The pitch is a single 16-bit value with its two 
least significant bits ignored, since the pitch must be 
an integer number of double words. Currently, there is 
also a restriction in the 82750DB limiting all display 
bitmap pitches to powers of two; so, the maximum 
display bitmap pitch is ±2^14 Bytes = ±16 kBytes. The 
display registers are described in Table 2-14. 

Table 2-14. Display Registers 

    Register        Description 

    yeven-lo, hi    This register pair points to the start of the Y bitmap 
                    or main bitmap that is to be displayed during an even 
                    field scan. 

    yodd-lo, hi     This register pair points to the start of the Y bitmap 
                    or main bitmap that is to be displayed during the odd 
                    field scan. 

    ypitch          The value in this register is added to the current Y 
                    bitmap pointer value each time a Y transfer is performed. 

    vu-lo, hi       This register pair points to the start of the VU bitmap. 
                    This bitmap is read to generate the VU values for both 
                    odd and even field scans. 

    vupitch         This value is added to the current VU bitmap pointer 
                    value each time a VU transfer is performed. 

    vptr-lo, hi     This register pair points to the start of a series of 
                    82750DB register loads stored in VRAM. 

    vpitch          This value is added to the current 82750DB register load 
                    pointer each time a 82750DB register load is performed. 
                    The pitch is equal to the number of bytes from the start 
                    of one register load to the start of the next register 
                    load. 



3.0 HARDWARE INTERFACE 



VRAM Interface 

The VRAM interface performs the following opera- 
tions: 

• Maintains VRAM pointers for the two input FIFOs, 
the two output FIFOs, the statistical decoder, the 
Y (main) bitmap, the VU bitmap, and the 
82750DB register load. 

• Decodes VBUS codes and takes appropriate ac- 
tions such as generating a transfer cycle, sched- 
uling refresh cycles, or generating interrupt condi- 
tions. 



• Arbitrates VRAM accesses between the two input 
FIFOs, the two output FIFOs, the statistical decoder, 
the transfer request logic, the VRAM refresh logic, 
and the external VRAM access logic. 

• During a memory cycle, performs appropriate address 
arithmetic on the VRAM pointer used for 
that memory cycle. 

• As a result of certain VBUS codes, performs a 
shadow copy that consists of copying display-related 
VRAM pointer values from shadow registers 
(that are loaded by the host CPU or the microcode 
processor) to working registers where the 
various pointers are used for transfer cycles 
when the 82750DB is refreshing the display 
screen. 



Table 3-1. VRAM Interface Signals 

    Signal      Description 

    MREQ#       MEMORY REQUEST is asserted during the first cycle of a 
                VRAM memory access. 

    TRNFR#      The TRANSFER output indicates the current memory cycle is 
                a result of a 82750DB transfer request. 

    RFSH#       The REFRESH output indicates the current memory cycle is 
                a result of a 82750DB refresh request. 

    NXTFST#     The NEXT FAST output indicates the next memory access 
                will use the same row address as the current memory 
                access. This facilitates the use of page mode memory 
                accesses. 

    MRDY#       The MEMORY READY input indicates the availability of 
                valid data on the D[31:0] pins. 

VRAM ACCESSES 

The 82750PB can initiate five different types of 
memory accesses: FIFO read, FIFO write, transfer 
read, transfer write, and refresh. In addition, the 
82750PB supports VRAM accesses by external log- 
ic. During an external access VRAM cycle, the 
82750PB tri-states its VRAM address and data bus- 
es and performs a host VRAM read or host VRAM 
write cycle. There is another operation performed by 
the 82750PB, a shadow copy, that is not a VRAM 
cycle but is arbitrated as though it were, since no 
VRAM cycles can take place during a shadow copy. 

The seven types of VRAM cycles initiated by the 
82750PB, including host VRAM read and host 
VRAM write, begin with the 82750PB asserting a 
combination of its three VRAM cycle definition out- 
puts: TRNFR#, RFSH#, and WE#. External logic 
detects the state of these signals, validated by 
MREQ#, and produces the appropriate sequence of 
VRAM control signals (RAS, CAS, etc.) to perform 
the type of memory cycle the 82750PB has request- 
ed. The 82750PB requires that each of these VRAM 
cycles take a minimum of two T-cycles, or T-states, 
denoted T1 and T2. External logic can insert addi- 
tional T2 states in order to stretch the VRAM cycle 
to more than two T-cycles. The start of a new VRAM 
access cycle is signaled by the assertion of MREQ# 
for the first T-cycle, T1. The VRAM access cycle 



definition signals, TRNFR#, RFSH#, and WE#, are 
asserted at the start of T1 and remain asserted until 
the end of the last T2. Other VRAM operations can 
be described similarly by sequences of T-states. Re- 
fer to Figure 3-4 and 3-5 on page 42 for timing dia- 
grams. 

Table 3-2 defines the states used for all VRAM access 
operations. A state diagram for the VRAM/Host 
Interface is provided in Figure 3-1. This diagram 
includes the FIFO access states. 

Table 3-2. 82750PB VRAM Access States 

    State       Description 
    Ti          Idle State, No VRAM Activity 
    T1, TF1     First State of a VRAM FIFO Cycle 
    T2, TF2     Last State of a VRAM FIFO Cycle 
    TSC         The T-State required to perform a shadow copy 
    TTX1        First State of a VRAM Transfer Cycle 
    TTX2        Last State of a VRAM Transfer Cycle 
    TRF1        First State of a VRAM Refresh Cycle 
    TRF2        Last State of a VRAM Refresh Cycle 



[State diagram with regions labeled FIFO ACCESS and HOST ACCESS] 

Figure 3-1. Access State Diagram 

Note that during successive VRAM cycles it is not 
necessary to go back to the idle state, Ti, between 
each cycle; the TF2 state can be followed directly by 
a T1 state, starting the next VRAM cycle. This 
results in efficient utilization of the 82750PB/VRAM 
bandwidth by allowing a VRAM cycle time of 2 
T-states. 



FAST VRAM CYCLES 

When the 82750PB performs Data Read or Data 
Write VRAM cycles for the input or output FIFOs, it 
performs two 32-bit accesses to read or write one 
64-bit value. These accesses are always performed 
in a sequence of EvenAddress followed by 
EvenAddress + 1, which guarantees both that the two 
sequential accesses will be in opposite banks and that 
the two accesses will be within the same VRAM 
page. This allows external logic to use either 
bank-interleaving or a page-mode access to complete the 
second access of the sequence and improve the 
VRAM bandwidth. However, the second access 
does not need to be handled differently from the 
first: except for the assertion of the NXTFST# 
signal, both accesses are standard VRAM accesses, 
so external logic can ignore the NXTFST# signal and 
treat the two accesses as two normal data read or 
data write cycles. Note that NXTFST# is not asserted 
for transfer, refresh, or host memory accesses. 



The NXTFST# output signal is provided for cases 
when external logic can generate a faster access for 
the second access of the two sequential accesses. 
During such a pair of accesses, NXTFST# is assert- 
ed during the first of the two accesses in order to 
provide sufficient time for the external logic to gener- 
ate the appropriate fast memory cycle for the sec- 
ond access. Refer to the timing diagrams in Figures 
3-4 and 3-5 (page 42) for examples illustrating the 
use of the NXTFST# signal. 

VBUS CODES 

Transfer request, interrupt, and synchronization 
codes are sent over the VBUS from the 82750DB to 
the 82750PB. The codes recognized by the 
82750PB are listed in Table 3-3, along with the 
actions taken by the 82750PB as a result of receiving 
each code. Codes that cause TRANSFER cycles 
must be asserted for at least two clock cycles of the 
82750PB to ensure that, in the worst case, the 
82750PB completes the transfer cycle before the 
code is released and the 82750DB starts shifting 
data from the VRAM shift registers. Other codes 
must also be asserted for a minimum of two 
82750PB clock cycles. Only the codes given in 
Table 3-3 are valid codes for the VBUS. Other codes 
are reserved for future use and should not be used. 
Once a transfer cycle code is sent to the 82750PB, 
any non-transfer code may be sent immediately. A 
subsequent transfer cycle code should be sent only 
after the current transfer cycle is completed. 

Table 3-3. VBUS Codes 

    Binary    Name         Action 
    0000      YBMX         TXRD Cycle Using Yc; Yc = Yc + Yp* 
    0001      VUBMX        TXRD Cycle Using VUc; VUc = VUc + VUp 
    0010      REGX         TXRD Cycle Using Vc; Vc = Vc + Vp 
    0011      WRDIGX       TXWR Cycle Using Yc; Yc = Yc + Yp 
    0100      YNPBMX       TXRD Cycle Using Yc; Yc = Yc 
    0101      Reserved     Reserved 
    0110      Reserved     Reserved 
    0111      WRDIGNPX     TXWR Cycle Using Yc; Yc = Yc 
    1000      DFL          DFL Int; Shadow Copy** 
    1001      82750DBSD    82750DB Shutdown Interrupt 
    1010      REFRESH      Schedule N Refresh Cycles 
    1011      Reserved     Reserved 
    1100      VODD         VBI Int; OF Int; Shadow Copy Odd; Hline = 0*** 
    1101      VEVEN        VBI Int; EF Int; Shadow Copy Even 
    1110      HLINE        lcnt++ (Increment Line Counter) 
    1111      NULL         No Action 

NOTES: 

*Yc: Y bitmap pointer, current; Yp: Y bitmap pitch; VU: VU bitmap; V: 82750DB register load. 
**Shadow Copy with Yc = Y-start-odd in odd field; Yc = Y-start-even in even field. 
***Hline: Horizontal Line Counter. 
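For simulation or test bench work, the code assignments of Table 3-3 can be held in a lookup table. The sketch below is one possible arrangement; `VBUS_CODES`, `TRANSFER_CODES` and `classify` are invented names, and the grouping into "transfer" and "other" simply follows whether the Action column names a TXRD or TXWR cycle. 

```python
VBUS_CODES = {
    0b0000: "YBMX", 0b0001: "VUBMX", 0b0010: "REGX", 0b0011: "WRDIGX",
    0b0100: "YNPBMX", 0b0111: "WRDIGNPX", 0b1000: "DFL",
    0b1001: "82750DBSD", 0b1010: "REFRESH", 0b1100: "VODD",
    0b1101: "VEVEN", 0b1110: "HLINE", 0b1111: "NULL",
}

# Codes whose action in Table 3-3 is a TXRD or TXWR cycle.
TRANSFER_CODES = {"YBMX", "VUBMX", "REGX", "WRDIGX", "YNPBMX", "WRDIGNPX"}


def classify(code):
    """Return (kind, name) for a 4-bit VBUS code; reserved codes have no name."""
    name = VBUS_CODES.get(code & 0xF)
    if name is None:
        return ("reserved", None)
    kind = "transfer" if name in TRANSFER_CODES else "other"
    return (kind, name)
```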



PRIORITY 

Each time the VRAM state machine completes a 
VRAM operation and returns to the Ti state, it 
examines all pending VRAM access requests and selects 
the highest priority request for the next VRAM 
operation. The priority ordering of these requests is 
listed in Table 3-4. 

Table 3-4. Priority of VRAM Operations 



Request Type      Priority
Transfer Cycle    Highest
Shadow Copy          •
Host Access          •
VRAM Refresh         •
FIFO Read/Write   Lowest



NOTE: 

The shadow copy is treated as a VRAM operation even 
though it does not result in an access to VRAM. 

The VRAM refresh operation is placed low on the 
priority list to reduce the latency in servicing transfer 
requests and external VRAM requests. Since a single 
REFRESH code from the 82750DB schedules a 
number of refresh cycles, a higher priority for refresh 
would cause all the refresh cycles to occur in a burst 
that would lock out all lower priority requests until all 
refresh cycles completed. Instead, the following 
restriction applies to all request types with higher 
priority than refresh: high priority requests, such as 
transfer cycles, shadow copies, and external VRAM 
access must occur infrequently enough to allow 
proper refresh of the VRAM chips. Transfer cycles 
and shadow copies, by their nature, occur infre- 
quently so they are not generally a problem. 

There is a separate priority scheme for the five FIFO 
channels. The scheme used is rotating priority with 
automatic override and single cycle arbitration. Ro- 
tating priority means that the priority is assigned in a 
fixed cyclic order with the lowest priority given to the 
FIFO channel that "won" the last FIFO access. 
There is only one level of memory, so the order that 
requests arrive is not a factor in the arbitration. The 
cyclic order is given in Figure 3-2. 

As an example, if input FIFO 0 (abbreviated if0) was 
the last channel to perform a cycle, the priority order 
for the next FIFO access (from highest to lowest) 
would be: if1, sd, of0, of1, and if0. 






Automatic override means that the rotating cyclic priority 
can be bypassed if there is an URGENT condition 
for one of the channels. A channel is urgent if the 
microcode processor is frozen because the proces- 
sor is waiting for that channel to be ready. The chan- 
nel can be either an input channel that is empty or 
an output channel that is full. In this case, the urgent 
channel gets the next available cycle. However, the 
priority will still be lower than non-FIFO requests, 
such as refresh cycles. 
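
The rotating priority and automatic override described above can be sketched as follows. This is only an illustrative model, not the device's arbitration logic: the cyclic order is inferred from the worked example in the text, and the names `CYCLIC_ORDER`, `priority_order`, and `next_channel` are hypothetical.

```python
# Cyclic order inferred from the worked example in the text
# (if0 -> if1 -> sd -> of0 -> of1 -> back to if0). This is an assumption.
CYCLIC_ORDER = ["if0", "if1", "sd", "of0", "of1"]


def priority_order(last_winner):
    """Return the channels from highest to lowest priority.

    The channel that won the last FIFO access gets the lowest priority.
    """
    i = CYCLIC_ORDER.index(last_winner)
    return CYCLIC_ORDER[i + 1:] + CYCLIC_ORDER[:i + 1]


def next_channel(last_winner, requesting, urgent=None):
    """Pick the FIFO channel that gets the next available VRAM cycle.

    requesting: set of channel names with a pending request.
    urgent: channel the microcode processor is frozen on, or None.
    """
    # Automatic override: an urgent channel bypasses the rotation.
    if urgent is not None and urgent in requesting:
        return urgent
    for channel in priority_order(last_winner):
        if channel in requesting:
            return channel
    return None  # no FIFO request pending
```

The model covers only arbitration among the five FIFO channels; non-FIFO requests such as refresh cycles still take precedence, as stated above.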

Single clock cycle arbitration means that the selec- 
tion of the next channel that will get an access oc- 
curs in a single T-cycle or T-state, either in a Ti state 
or during the last T2 state of the previous VRAM 
cycle. 

VRAM POINTERS 

The VRAM interface maintains VRAM pointers for 
the FIFOs, as well as display-related pointers for the 
82750DB. Internally each pointer or address is 
stored as a 30-bit value addressing a double word in 
VRAM. The pointer values are read and written as 
two 16-bit words representing a 32-bit byte address 
(refer to Figure 3-3). With a 30-bit double word 
address, the 82750PB can decode a VRAM address 
space of 1G double words or 4 GBytes. 
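
As a small illustration of the pointer format, the sketch below splits a 32-bit byte address into the two 16-bit words that are read and written, and derives the 30-bit double word address. The function names are hypothetical and are used only for this example.

```python
def split_pointer(byte_address):
    """Split a 32-bit byte address into (lo, hi) 16-bit words."""
    if not 0 <= byte_address <= 0xFFFFFFFF:
        raise ValueError("byte address must fit in 32 bits")
    return byte_address & 0xFFFF, byte_address >> 16


def join_pointer(lo, hi):
    """Rebuild the 32-bit byte address from its two 16-bit words."""
    return ((hi & 0xFFFF) << 16) | (lo & 0xFFFF)


def dword_address(byte_address):
    """Return (30-bit double word address, byte offset within the double word)."""
    return byte_address >> 2, byte_address & 0x3
```

With 30 address bits the model covers 2^30 double words, which matches the 4 GByte address space quoted above.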

Input and output FIFOs can address down to a sin- 
gle word or byte in VRAM. A FIFO's pointer is post- 
incremented or post-decremented in parallel with its 
VRAM read or write cycle. 

The statistical decoder can only start decoding bit- 
streams on double word boundaries in VRAM and 
can only increment through VRAM. The decoder's 
pointer is post-incremented in parallel with each of 
its VRAM read cycles. 

Display-related pointers are updated by adding a 
pitch value to the current value during the corre- 
sponding transfer cycle. 



If a VRAM pointer appears on the B-Bus as a source 
or as a destination, then the following rules apply: 

Rule 1 

If a B-Bus destination refers to an address that is 
both even and > 0x1F, then the source is restricted 
to "-lo" pointers if the source refers to a pointer. 

Rule 2 

If a B-Bus destination refers to an address that is 
both odd and > 0x1F, then the source is restricted to 
"-hi" pointers if the source refers to a pointer. 

SHADOW COPY 

When a VODD, VEVEN, or DFL code is received 
from the 82750DB over the VBUS, a shadow copy is 
scheduled. The actual shadow copy will occur as 
soon as the priority logic allows. Any VRAM access 
in progress must complete and a pending transfer 
cycle, if any, must be performed before the shadow 
copy can start. During the operation, shadow regis- 
ters for the Y-START, Y-PITCH, VU-START, VU- 
PITCH, 82750DB-START, and 82750DB-PITCH are 
copied into the corresponding working registers. 
During display refresh, the address arithmetic is per- 
formed on the working registers. The shadow regis- 
ters can be loaded by the host CPU or by a micro- 
code routine with less critical timing constraints, and 
then copied instantly by a shadow copy when it is time 
to update the registers, either prior to the next field 
or during the active display for split screen effects. 



[Figure: ring diagram connecting inFIFO1, inFIFO0, outFIFO1, outFIFO0, and the Statistical Decoder] 

Figure 3-2. Cyclic Ordering of FIFOs 



Bits 31:2    VRAM Address (30 bits)
Bits 1:0     Byte Address within Double-Word
Bits 31:16   Most Significant Word of VRAM Address
Bits 15:0    Least Significant Word of VRAM Address

Figure 3-3. VRAM Addressing 
There are actually two shadow registers for Y-START: 
one for the start of odd fields and one for the start 
of even fields. A VODD code causes Y-START-ODD 
to be copied into the working register Y-CURRENT. 
Similarly, a VEVEN code causes the Y-START- 
EVEN to be copied into Y-CURRENT. A DFL code 
causes the Y-START-ODD value to be copied if the 
most recent start of field code received is a VODD, 
or a Y-START-EVEN value if the most recent start of 
field code was a VEVEN. This allows a simple interlaced 
or non-interlaced display to be refreshed with 
no host CPU intervention. For more complex dis- 
plays, such as split screens, the host CPU must up- 
date the shadow registers prior to each shadow 
copy. A shadow copy operation requires 2 T-cycles. 
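
The selection of the Y start value during a shadow copy can be modeled roughly as below. This is a behavioral sketch only: the class and attribute names are hypothetical, and the behavior of a DFL code received before any start-of-field code is not stated in the text, so the model simply leaves the working register unchanged in that case.

```python
class YStartShadow:
    """Toy model of the Y-START shadow registers and Y-CURRENT."""

    def __init__(self, y_start_odd, y_start_even):
        self.y_start_odd = y_start_odd    # shadow register for odd fields
        self.y_start_even = y_start_even  # shadow register for even fields
        self.last_field = None            # most recent start-of-field code
        self.y_current = None             # working register

    def on_vbus_code(self, code):
        """Apply the effect of a VBUS code name on Y-CURRENT."""
        if code == "VODD":
            self.last_field = "odd"
        elif code == "VEVEN":
            self.last_field = "even"
        elif code != "DFL":
            return  # other codes do not schedule a shadow copy
        # Shadow copy: DFL reuses the field selected by the last VODD/VEVEN.
        if self.last_field == "odd":
            self.y_current = self.y_start_odd
        elif self.last_field == "even":
            self.y_current = self.y_start_even
```

Updating `y_start_even` or `y_start_odd` between codes mimics the host reloading a shadow register before the next shadow copy, as needed for split screens.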



Host Interface 

The Host Interface provides the following functions: 

• Arbitrates host CPU and 82750PB access to 
VRAM. 

• Provides the host access to external devices. 

• Provides the host access to 82750PB internal 
registers and memories. 

Signals specific to the Host Interface are listed in 
Table 3-5. 




Table 3-5. Host Interface Signals 



Signal    Description

HREQ#     HOST REQUEST: Asynchronous request from the host for all types of 
          host access. Used both to request and release system buses. 

HREG#     HOST REGISTER: Single-ranked control to request host access to 
          82750PB internal registers in concert with HRAM#. 

HRAM#     HOST VRAM: Single-ranked control to request host access to VRAM in 
          concert with HREG#. 

HALEN#    HOST ADDRESS LATCH ENABLE: Asynchronous status from the host 
          indicating the presence of valid address, write enable (transaction 
          direction control), and the byte enables at the interface of the 82750PB. 

HBUSEN#   HOST BUS ENABLE: 82750PB synchronous status granting the host 
          access to the address, write enable, data bus, and byte enables at the 
          interface of the 82750PB. 

HRDY#     HOST READY: 82750PB synchronous status to the host indicating the 
          presence of valid data appearing at the 82750PB's databus for VRAM 
          and register accesses and optionally for external accesses. 

HINT#     HOST INTERRUPT: 82750PB synchronous interrupt to the host, set 
          under direct or indirect microprogram control. 


Signals common to the host, VRAM, and external device interfaces are listed in Table 3-6. 
Table 3-6. Host, VRAM, and External Device Interfaces 


Signal     Description

A[31:2]    ADDRESS BUS: System address bus used to select unique VRAM, the 
           82750PB register, and external device locations that will be accessed 
           under host control. The lower seven bits A[8:2] are bidirectional and are 
           used during register accesses. 

D[31:0]    DATA BUS: Bidirectional system data bus used to transfer data to and 
           from all sources and destinations. When transferring 16-bit host register 
           values, the data bus MSH and LSH will both carry identical values. 

WE#        WRITE ENABLE: Bidirectional, single-ranked signal used to determine 
           the data transfer direction. When active during host register cycles, data 
           flows from the host to an 82750PB destination. During host VRAM cycles, 
           WE# active will define the data direction to be from the host to VRAM. 

BE[3:0]#   BYTE ENABLE: Bidirectional signals used to select the bytes that will be 
           modified during data transactions. All host register transactions are 
           performed 16 bits at a time, while VRAM may be modified 8 bits at a time. 
As with VRAM operations, host operations are described through a sequence of T-states. Table 3-7 defines 
the T-states used to implement all host transactions with VRAM, external devices, and the 82750PB. 

The master execution state diagram that defines the VRAM/Host transactions is provided in Figure 3-1. 

Table 3-7. 82750PB Host Transaction States 



State   Description

TA      First state of any host transaction. Entry into TA will be granted after 
        HREQ# has been asserted. During this state, the 82750PB will tri-state 
        its address, data bus, write enable, and byte enable signals to provide a 
        full cycle of "dead-band" before the assertion of HBUSEN#. In the state 
        immediately following TA, HBUSEN# will assert, allowing the host to drive 
        the host buses. 

TB      First cycle in which the host is granted bus access for register or VRAM 
        transactions. The sequencer will remain in TB until HALEN# is received, 
        indicating that the address, write enable, and byte enable signals are 
        stable at the 82750PB pins. 

TC1     First cycle that output data is valid. 

TCn     This state is entered to wait for the completion of the current host cycle. 
        The cycle is defined as complete when HREQ# deasserts. HRDY# is 
        asserted along with valid data until the transition to state TD occurs. 

TD      The last cycle of a host transaction. HBUSEN# is deasserted, allowing 
        one dead-band cycle to allow control of the address, data, write enable, 
        and byte enable signals to be returned to the 82750PB. 

TV1     First cycle of a Host VRAM transaction. Memory is requested and is 
        followed by a transition to TV2. 

TV2     Last cycle of a Host VRAM transaction. The sequencer will remain in TV2 
        until MRDY# is received. 



A single stage of input synchronization is employed 
for HREG#, HRAM#, WE#, and BE[0]#, while 
HREQ# and HALEN# are programmable to have 
one or two stages by bit 12 of the Microcode Proc- 
essor Control Register. See Table 3-10. T-state tran- 
sitions are caused by the synchronized versions of 
these signals. 

The synchronized versions of HREG# and HRAM# 
must be stable before entry into T-state TA. The 
synchronized versions of WE#, BE[0]#, and 
HALEN# should be stable before exiting T-State 
TB. Once asserted, all of the above signals should 
remain stable until the deassertion of HBUSEN#. 

The type of host cycle to perform is determined by 
the states of HREG# and HRAM# as indicated in 
Table 3-8. 



HOST REGISTER ACCESS 

The host has access to the 82750PB's internal reg- 
isters and memories to monitor and control the oper- 
ation of the microcode processor, provide a means 
of debugging microprogram routines, and to function 
as the primary test port for production testing. 

Register access is initiated by the host asserting 
HREQ#, HREG#, and HRAM# as shown in Table 
3-8 and in the timing diagrams on pages 42 through 
45. After the host has been granted bus access by 
an active HBUSEN# in state TB, the address, write 
enable, and byte enables may be driven. After these 
signals have stabilized HALEN# is asserted, en- 
abling a read or a write operation to occur. 



Table 3-8. Host Cycle Types 



HREG#   HRAM#   Host Cycle Type
1       1       External
0       1       Register
1       0       VRAM
0       0       Reserved
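
The cycle-type decoding in Table 3-8 can be expressed as a simple lookup. The sketch assumes that the cells that appear blank in the scanned table are 0 (signal asserted, since both signals are active low); `host_cycle_type` is a hypothetical helper name.

```python
# Keys are the logic levels on (HREG#, HRAM#); both signals are active low.
# Assumption: blank cells in the scanned table stand for 0.
HOST_CYCLE_TYPES = {
    (1, 1): "external",
    (0, 1): "register",
    (1, 0): "vram",
    (0, 0): "reserved",
}


def host_cycle_type(hreg_n, hram_n):
    """Return the host cycle type for the given HREG# and HRAM# levels."""
    return HOST_CYCLE_TYPES[(hreg_n, hram_n)]
```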
In the case of a register read, state TC1 is entered 
and the data bus is driven with the internal value. 
One cycle later, a transition to state TC occurs, and 
HRDY# activates, signaling the presence of stabilized 
data at the 82750PB data pins. This state (TC) 
will be maintained until the host deasserts HREQ#, 
signaling the completion of the cycle and causing a 
transition to state TD. 

In the case of a register write, TC1 is again entered 
(from TB), but the data bus may now be driven by 
the host. (During host cycles, data bus drive activity 
is indirectly controlled by WE# and an additional 
dead-band is provided by entry into state TC1 to al- 
low for internal WE# stabilization.) Stable data at 
the 82750PB interface, as well as the completion of 
the write cycle, is signaled by the deassertion of 
HREQ#. As with reads, the deactivation of HRDY# 
signals the transition to state TD. 

As state TD is entered, HRDY# and HBUSEN# 
deassert, the address, data, write enable, and byte 
enables tri-state, and bus control is returned to the 
82750PB in the following cycle. 

HOST VRAM ACCESS 

Because the 82750PB is so closely coupled with 
VRAM, host accesses to VRAM are arbitrated and 
controlled by the 82750PB. VRAM access is initiated 
by the host asserting HREQ#, HREG#, and 
HRAM# as shown in the Host Cycle Table above 
and in the timing diagrams on pages 42 through 45. 
After the host has been granted bus access by an 
active HBUSEN#, the address, write enable, and 
byte enables may then be driven. After these signals 
have stabilized at the memory devices (or longest 
relevant propagation path), HALEN# is asserted, 
enabling a read or a write operation to occur. 

Because VRAM will not drive the data bus until after 
a memory request, a transition into state TC1 to al- 
low for data bus direction stabilization is not re- 
quired. Instead, a transition to state TV1 occurs, 
which asserts MREQ# for a single cycle and is fol- 
lowed by a transition to TV2. TV2 will remain the 
current state until the reception of an active 
MRDY#. 

In the case of a VRAM read, the memory data bus 
will be driven during TV1, and valid data will appear 
in state TV2. Data will be guaranteed valid coinci- 
dent with the deassertion of MRDY# from memory. 

In the case of a VRAM write, the memory data bus is 
driven with valid data during TV1. Again the recep- 
tion of MRDY# will serve to indicate the completion 
of the memory operation. 



NOTE: 

The host device must be able to transmit or receive 
memory data in order to be valid at the trailing 
edge of MRDY# at the data's destination (memory 
or host). 

After MRDY# becomes active, a transition from TV2 
into TC1 is accomplished to allow time to propagate 
data to the host. TC is then entered to await the 
deassertion of HREQ# (if it has not already oc- 
curred). TD is then entered, duplicating the dead- 
banding previously described. 

HOST EXTERNAL ACCESS 

In addition to VRAM and register host access, an 
external device access mechanism is provided. During 
this access, upon the receipt of HREQ# with 
HREG# and HRAM# inactive, the 82750PB releases 
the address, data, write enable, and byte enables 
in state TA. 

The difference here is that state TC1 is directly en- 
tered from TA, thereby ignoring any transitions of 
HALEN#. Since the 82750PB also ignores the data 
bus direction control (write enable) the host and an 
external device may communicate unencumbered 
by the 82750PB. 

Entry into state TC directly follows TC1 in the ex- 
pected sequence and remains there until HREQ# is 
released. This is followed by entry into TD. 
HBUSEN# is asserted while TC1 
and TCn are active. 

During an external access, HRDY# is not asserted 
unless the external logic asserts MRDY# as shown 
in Figure 3-7. 
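
The three host access types walk through the T-states in different orders. The sketch below lists the nominal sequence for each, as read from the descriptions above. It is a simplification: `host_t_states` and its parameters are hypothetical, extra TB cycles spent waiting for HALEN# are not modeled, and at least one TCn cycle is assumed for every access.

```python
def host_t_states(cycle_type, hreq_wait=0, mrdy_wait=0):
    """Return the nominal T-state sequence for one host transaction.

    hreq_wait: extra TCn cycles spent waiting for HREQ# to deassert.
    mrdy_wait: extra TV2 cycles spent waiting for MRDY# (VRAM cycles only).
    """
    if cycle_type == "register":
        states = ["TA", "TB", "TC1"]
    elif cycle_type == "vram":
        # TV1 requests memory; TV2 repeats until MRDY# is received.
        states = ["TA", "TB", "TV1"] + ["TV2"] * (1 + mrdy_wait) + ["TC1"]
    elif cycle_type == "external":
        # TC1 is entered directly from TA; HALEN# is ignored.
        states = ["TA", "TC1"]
    else:
        raise ValueError("unknown host cycle type: %r" % (cycle_type,))
    return states + ["TCn"] * (1 + hreq_wait) + ["TD"]
```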

HOST REGISTER ADDRESS MAPPING 

Table 3-9 shows the host address mapping of the 
on-chip registers and memories, in terms of the off- 
set in bytes, from the base address for 82750PB 
accesses. Note that the 82750PB only supports 
word accesses to these registers. Therefore, the 
least significant bit of the byte offset should be set to 
zero. The 82750PB forms the register address from 
inputs on the A[31:2] pins and BE#[3:0] pins. The 
A[31:2] pins specify the double word address of the 
register, and combinations of the BE# pins determine 
which of the two words within the double word is being 
addressed. BE#[3:0] = 1100 (binary) selects the least 
significant word within a double word, and BE#[3:0] = 
0011 (binary) selects the most significant word within a 
double word. These are the only two valid patterns 
for BE# inputs during a host register access cycle. 
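
To make the address formation concrete, the following sketch derives the A[31:2] value and the BE#[3:0] pattern for a register byte offset. The `base` argument is a hypothetical, double-word-aligned base address chosen by the system designer; it is not defined in this document.

```python
def register_access_pins(base, byte_offset):
    """Return (A[31:2] value, BE#[3:0] pattern) for a 16-bit register access.

    base is a hypothetical double-word-aligned base address.
    """
    if base & 0x3:
        raise ValueError("base is assumed to be double-word aligned")
    if byte_offset & 0x1:
        raise ValueError("only 16-bit word accesses are supported")
    address = base + byte_offset
    a_31_2 = address >> 2  # double word address
    # BE# is active low: 1100 enables the low word, 0011 the high word.
    be_n = 0b0011 if address & 0x2 else 0b1100
    return a_31_2, be_n
```

For example, byte offsets 0x100 and 0x102 fall in the same double word and differ only in the BE# pattern.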
Table 3-9. Host Address Mapping 



Byte Address    Description
0x000-0x07E     (a) A source and destination registers
0x080-0x0FE     (b) B source and destination registers
0x100-0x17E     (c) Microcode processor control and status registers
0x180-0x1FE     (d) VRAM pointer RAM



NOTE: 



The host should only perform 16-bit word reads 
or writes to 82750PB registers. The 82750PB 
does not support byte reads or writes or double 
word reads or writes to on-chip registers. 



When the host CPU reads or writes to areas (a, b, or 
d) and the 82750PB is not already in a HALT state, 
the microcode processor is automatically HALTED 
for the one T-cycle actually required to complete the 
data transfer, and then the processor is restarted 
after the transfer is complete. If the 82750PB is in a 
HALT state when the host access is initiated, it will 
remain in the HALT state following the completion of 
the access. This is transparent to both the host CPU 
and the microcode processor. 



During an access to areas (a) or (b), bits 6:1 of the 
byte offset should be set to the source or destina- 
tion code for the register that will be read or written. 
The coding is the same as used in the microcode 
instruction word. Bit 0 is always set to a zero. Refer 
to the 82750PB Source and Destination Coding 
Table found in Chapter 4. 
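
A host-side helper could classify a register byte offset into one of the four areas of Table 3-9 and, for areas (a) and (b), recover the 6-bit source or destination code held in bits 6:1. This is an illustrative sketch; `decode_register_offset` is a hypothetical name.

```python
def decode_register_offset(byte_offset):
    """Return (area, code) for a host register byte offset.

    area is one of "a", "b", "c", "d" as in Table 3-9.
    code is the 6-bit source/destination code for areas a and b, else None.
    """
    if byte_offset & 0x1 or not 0 <= byte_offset <= 0x1FE:
        raise ValueError("offset must be even and within 0x000-0x1FE")
    area = "abcd"[byte_offset >> 7]  # each area spans 0x80 bytes
    code = (byte_offset >> 1) & 0x3F if area in "ab" else None
    return area, code
```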

Area (c) contains one write-only register, the CONTROL 
register, and two read-only registers, the INTERRUPT 
FLAG register and the microcode PROCESSOR 
STATUS register. The CONTROL register is 
used to halt or single-step the microcode processor, 
to enable or mask interrupts to the host CPU, 
to select the signal that is output via the PMON/FRZ 
pin, and to enable or disable the 82750PA emulation 
mode. The bit assignments for the CONTROL register 
are given in Table 3-10. 

During reset of the 82750PB, the HALT bit is set to a 
one, the six Interrupt Enable bits are reset to zero, 
the Disable SYNC bit is set to zero, the PMON/FRZ 
bit is set to zero (so that the FRZ signal is output), 
and the Enable 82750PB bit is reset to zero (so that 
on reset, the 82750PB starts in a 82750PA emula- 
tion mode). 
Table 3-10. Bit Assignments for Microcode Processor CONTROL 
Register (Write-Only, Byte Offset = 0x100) 


Bit          Name             Description

Bit 0        HALT             1 = Microcode Processor Halt 
                              0 = Microcode Processor Run 

Bit 1        SINGLE-STEP      1 = Execute One Instruction and then Halt 
                                  (Only when Already Halted, Bit 0 = 1) 
                              0 = No Action 

Bit 2        Enable MCINT     1 = Enable Microcode Interrupts to Host CPU 
                              0 = Mask Microcode Interrupts 

Bit 3        Enable VBI       1 = Enable Vertical Blanking Interrupt to Host CPU 
                              0 = Mask Vertical Blanking Interrupt 

Bit 4        Enable DFL       1 = Enable DFL Interrupt to Host CPU 
                              0 = Mask DFL Interrupt 

Bit 5        Enable SD        1 = Enable 82750DB Shutdown Interrupt to Host 
                              0 = Mask SD Interrupt 

Bit 6        Enable OFI       1 = Enable Odd Field Interrupt 
                              0 = Mask OF Interrupt 

Bit 7        Enable EFI       1 = Enable Even Field Interrupt 
                              0 = Mask EF Interrupt 

Bits 8-11*                    RESERVED; Write as Zeros 

Bit 12       Disable SYNC     1 = Disable Synchronizers for HREQ#/HALEN# 
                              0 = Enable Synchronizers for HREQ#/HALEN# 

Bit 13       PMON/FRZ         1 = Output FRZ# Signal on PMFRZ# Pin 
                              0 = Output PMON# Signal on PMFRZ# Pin 

Bit 14                        RESERVED; Write as Zero 

Bit 15       Enable 82750PB   1 = Enable 82750PB Mode 
                              0 = Enable 82750PA Emulation Mode 

*All other bits are reserved for future use, and should be written as zeros. 
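
Host software might represent the CONTROL register bits of Table 3-10 as named flags, as in this sketch. The `Control` class, `RESERVED_MASK`, and `control_word` are hypothetical names introduced for illustration; the bit positions are taken from the table.

```python
from enum import IntFlag


class Control(IntFlag):
    """Bit assignments of the write-only CONTROL register (offset 0x100)."""
    HALT = 1 << 0
    SINGLE_STEP = 1 << 1
    ENABLE_MCINT = 1 << 2
    ENABLE_VBI = 1 << 3
    ENABLE_DFL = 1 << 4
    ENABLE_SD = 1 << 5
    ENABLE_OFI = 1 << 6
    ENABLE_EFI = 1 << 7
    DISABLE_SYNC = 1 << 12
    PMON_FRZ = 1 << 13
    ENABLE_82750PB = 1 << 15


RESERVED_MASK = 0x4F00  # bits 8-11 and bit 14 must be written as zero


def control_word(flags):
    """Return the 16-bit value to write, rejecting reserved bits."""
    value = int(flags)
    if value & RESERVED_MASK or not 0 <= value <= 0xFFFF:
        raise ValueError("reserved CONTROL bits must be written as zero")
    return value
```

Because the register is write-only, a driver would normally keep its own copy of the last value written rather than reading it back.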
The INTERRUPT FLAG register holds a flag for 
each of the six interrupt sources. A flag bit is set to a 
one when the interrupt condition is detected (inde- 
pendent of the state of the corresponding Interrupt 
Enable/Mask bit in the CONTROL register), and all 
flags are cleared to zero each time the INTERRUPT 
FLAG register is read. If this register is read during 
the same cycle that an interrupt condition is detect- 
ed, the flag bit corresponding to that interrupt condi- 
tion will remain at a one. This new interrupt condition 
will then be seen by the host processor when it next 
reads the INTERRUPT FLAG register. This behavior 
ensures that an interrupt is not lost if it occurs at the 
same cycle that the INTERRUPT FLAG register is 
read (and reset). In addition, the Microcode Interrupt 
source has an overflow flag that indicates if more 
than one Microcode Interrupt has occurred since the 
Interrupt Flag register was last read. The bit assign- 
ments for the INTERRUPT FLAG register are listed 
in Table 3-11. 



The PROCESSOR STATUS register holds four 
status bits: HALT, FREEZE, PMON, and SYNC 
status. HALT indicates that the processor is HALT- 
ED due to a HALT bit in the CONTROL register be- 
ing set to a ONE or due to the HALT# pin being 
asserted. FREEZE indicates that the processor is 
waiting for one of the VRAM channels to become 
ready or is waiting for an access to the VRAM point- 
er RAM. PMON is a signal that can be toggled by a 
special ALU opcode or a special B source code. 
This signal can be used for performance monitoring 
of microcode. SYNC status bit indicates the pres- 
ence or absence of the internal synchronizers for 
HREQ# and HALEN# inputs. In addition, the Inter- 
rupt Mask bits that are written into the PROCESSOR 
CONTROL register can be read from this register. 
These mask bits are read in the same polarity that 
they are written, but note that the bit positions and 
bit ordering are not consistent with the PROCES- 
SOR CONTROL register. The bit assignments for 
this register are given in Table 3-12. 

Address mappings for areas (a), (b), and (d) are given 
in Tables 3-13 to 3-15. 



Table 3-11. Bit Assignments for INTERRUPT FLAG Register 
(Read-Only, Byte Offset = 0x100) 



Bit        Description
Bit 8:0    Not Used, the State of These Bits Is Not Specified
Bit 9      EF Interrupt Flag
Bit 10     OF Interrupt Flag
Bit 11     MCINT Overflow Flag
Bit 12     82750DB Shutdown Interrupt
Bit 13     MCINT Microcode Interrupt
Bit 14     VBI Vertical Blanking Interrupt
Bit 15     DFL Display Format Load Interrupt
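
A host interrupt handler could decode a value read from the INTERRUPT FLAG register along these lines. The flag names and the function name are hypothetical labels for the bits in Table 3-11. Note that reading the real register clears the flags, so a handler should read it once and work from the saved value.

```python
# Bit positions from Table 3-11; bits 8:0 are unspecified and ignored.
INTERRUPT_FLAG_BITS = {
    9: "EF",
    10: "OF",
    11: "MCINT_OVERFLOW",
    12: "SD",
    13: "MCINT",
    14: "VBI",
    15: "DFL",
}


def decode_interrupt_flags(value):
    """Return the set of flag names that are set in a 16-bit register value."""
    return {name for bit, name in INTERRUPT_FLAG_BITS.items()
            if value & (1 << bit)}
```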
Table 3-12. Bit Assignments for PROCESSOR STATUS Register 
(Read-Only, Byte Offset = 0x102) 


Bit        Description
Bit 0      HALT (1 = Halted, 0 = Running)
Bit 1      FREEZE (1 = Frozen, 0 = Running)
Bit 2      PMON (1 = Active, 0 = Inactive)
Bit 3      Synchronizers on HREQ#/HALEN# (0 = Enabled, 1 = Disabled)
Bit 9:4    Not Used, the State of These Bits Is Not Specified
Bit 10     MCINT Microcode Interrupt Mask
Bit 11     VBI Vertical Blanking Interrupt Mask
Bit 12     DFL Display Format Load Interrupt Mask
Bit 13     82750DB Shutdown Interrupt Mask
Bit 14     OF Interrupt Mask
Bit 15     EF Interrupt Mask
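
The PROCESSOR STATUS register can be decoded in the same table-driven way. In this sketch the names are hypothetical, and the mask bits are interpreted with the same polarity as they are written to the CONTROL register (1 = interrupt enabled), as the text states.

```python
# Bit positions from Table 3-12; bits 9:4 are unspecified and ignored.
STATUS_BITS = {0: "HALT", 1: "FREEZE", 2: "PMON", 3: "SYNC_DISABLED"}
STATUS_MASK_BITS = {10: "MCINT", 11: "VBI", 12: "DFL",
                    13: "SD", 14: "OF", 15: "EF"}


def decode_status(value):
    """Return (status flags dict, set of enabled interrupt names)."""
    flags = {name: bool(value & (1 << bit))
             for bit, name in STATUS_BITS.items()}
    enabled = {name for bit, name in STATUS_MASK_BITS.items()
               if value & (1 << bit)}
    return flags, enabled
```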
Table 3-13. 82750PB A Bus Source/Destination Address Mapping 



Address (Hex)   ADST        ASRC
0x000           Null        Null
0x002                       hwid
0x004                       cc
0x006           maddr
0x008                       alu
0x00A           cnt         cnt
0x00C           cnt2        cnt2
0x00E           lcnt        lcnt
0x010           r0          r0
0x012           r1          r1
0x014           r2          r2
0x016           r3          r3
0x018           r4          r4
0x01A           r5          r5
0x01C           r6          r6
0x01E           r7          r7
0x020           mcode3      mcode3
0x022           mcode2      mcode2
0x024           mcode1      mcode1
0x026           pc          pc
0x028           pixint-c
0x02A           pixint      pixint
0x02C           *dram1      *dram1
0x02E           *dram2      *dram2
0x030           *dram1++    *dram1++
0x032           *dram2++    *dram2++
0x034           *dram1--    *dram1--
0x036           *dram2--    *dram2--
0x038           dram1       dram1
0x03A           dram2       dram2
0x03C           dram3       dram3
0x03E           dram4       dram4
0x040           *out1       *in1
0x042           out1++      *in2
0x044           shift-hi    *stat
0x046           out1-hi     *stat#
0x048           *out2
0x04A           out2++
0x04C           shift-r
0x04E           out2-hi
0x050           out1-c
0x052           in1-c
0x054           shift-l
0x056           in1-hi
0x058           out2-c
0x05A           in2-c
0x05C
0x05E           in2-hi
0x060           r8          r8
0x062           r9          r9
0x064           r10         r10
0x066           r11         r11
0x068           r12         r12
0x06A           r13         r13
0x06C           r14         r14
0x06E           r15         r15
0x070           cc          shift
0x072           fent        fent
0x074           *dram3      *dram3
0x076           *dram4      *dram4
0x078           *dram3++    *dram3++
0x07A           *dram4++    *dram4++
0x07C           *dram3--    *dram3--
0x07E           *dram4--    *dram4--
Table 3-14. 82750PB B Bus Source/Destination Address Mapping 



Address (Hex)   BDST        BSRC
0x080           Null        Null
0x082                       alu
0x084           *dram3      *dram3
0x086           *dram4      *dram4
0x088           *dram3++    *dram3++
0x08A           *dram4++    *dram4++
0x08C           *dram3--    *dram3--
0x08E           *dram4--    *dram4--
0x090           r0          r0
0x092           r1          r1
0x094           r2          r2
0x096           r3          r3
0x098           r4          r4
0x09A           r5          r5
0x09C           r6          r6
0x09E           r7          r7
0x0A0           r8          *in1
0x0A2           r9          *in2
0x0A4           r10         *stat
0x0A6           r11         *stat#
0x0A8           r12         circbuf
0x0AA           r13
0x0AC           r14
0x0AE           r15
0x0B0           circbuf     literal 0
0x0B2                       literal 1
0x0B4           *dram1      literal 2
0x0B6           *dram2      literal 3
0x0B8           *dram1++    literal 4
0x0BA           *dram2++    literal 5
0x0BC           *dram1--    literal 6
0x0BE           *dram2--    literal 7
0x0C0           *out1       prof
0x0C2           out1++
0x0C4           out1-lo     out1-lo
0x0C6           out1-hi     out1-hi
0x0C8           *out2       stat-lo
0x0CA           out2++      stat-hi
0x0CC           out2-lo     out2-lo
0x0CE           out2-hi     out2-hi
0x0D0           out1-c      out1-c
0x0D2           in1-c       in1-c
0x0D4           in1-lo      in1-lo
0x0D6           in1-hi      in1-hi
0x0D8           out2-c      out2-c
0x0DA           in2-c       in2-c
0x0DC           in2-lo      in2-lo
0x0DE           in2-hi      in2-hi
0x0E0           stat-ram    r8
0x0E2           stat-c      r9
0x0E4           stat-lo     r10
0x0E6           stat-hi     r11
0x0E8           yeven-lo    r12
0x0EA           yeven-hi    r13
0x0EC           yodd-lo     r14
0x0EE           yodd-hi     r15
0x0F0           ypitch      shift
0x0F2                       stat-c
0x0F4           vu-lo       *dram1
0x0F6           vu-hi       *dram2
0x0F8           vupitch     *dram1++
0x0FA           vpitch      *dram2++
0x0FC           vptr-lo     *dram1--
0x0FE           vptr-hi     *dram2--
Table 3-15. VRAM Pointer RAM Mapping 


Byte Address   Name       Description

0x180          Yw-lo      Working Copy of Y Pointer
0x182          Yw-hi

0x184          out1-lo    Output FIFO 1 Pointer
0x186          out1-hi

0x188          Yw-pitch   Working Copy of Y Pitch

0x18A                     RESERVED

0x18C          out2-lo    Output FIFO 2 Pointer
0x18E          out2-hi

0x190          VUw-lo     Working Copy of VU Pointer
0x192          VUw-hi

0x194          in1-lo     Input FIFO 1 Pointer
0x196          in1-hi

0x198          VUpitchw   Working Copy of VU Pitch

0x19A          vpitchw    Working Copy of 82750DB Pitch

0x19C          in2-lo     Input FIFO 2 Pointer
0x19E          in2-hi

0x1A0          vptrw-lo   Working Copy of 82750DB Pointer
0x1A2          vptrw-hi

0x1A4          stat-lo    Working Copy of Statistical Decoder Pointer
0x1A6          stat-hi

0x1A8          Yeven-lo   Shadow Copy of Y Start Even Pointer
0x1AA          Yeven-hi

0x1AC          Yodd-lo    Shadow Copy of Y Start Odd Pointer
0x1AE          Yodd-hi

0x1B0          Ypitch     Shadow Copy of Y Pitch

0x1B2          rfcnt      RFSH Cycles per RFSH Code from 82750DB

0x1B4          VU-lo      Shadow Copy of VU Start Pointer
0x1B6          VU-hi

0x1B8          VUpitch    Shadow Copy of VU Pitch

0x1BA          vpitch     Shadow Copy of 82750DB Pitch

0x1BC          vptr-lo    Shadow Copy of 82750DB Pointer
0x1BE          vptr-hi



NOTE: Register rfcnt is a write-only register and should never be read. 



Initializing the 82750PB 

The 82750PB is placed in a RESET state by assert- 
ing RESET# for at least ten T-cycles. In the RESET 
state, which continues until RESET# is released, all 
of the 82750PB's outputs are tri-stated for compati- 
bility with board test requirements. 

Proper initialization of the 82750PB requires that the 
82750PB is held in a RESET state by keeping RESET# 
active for at least 10 T-cycles, and then releasing 
RESET#. This is referred to as the INITIAL 
state. In the INITIAL state: 

• The microcode processor is halted. 

• All six interrupts are masked, and the interrupt 
latches are cleared. 

• The 82750PA/82750PB instruction format select 
bit is set to the 82750PA. 

• The VRAM interface is ready to service VRAM 
requests; however, none of the VRAM pointers 
are valid. 
• The number of refresh cycles that will be generated 
each time a RFSH code is received from the 
82750DB is set to 14 cycles. 

• All bidirectional I/O pins are tristated. 

After the 82750PB has been initialized, i.e., placed in 
the INITIAL state, but prior to releasing the 
82750DB's reset signal, the following operations 
must be performed: 

• Load the REFRESH-CYCLES-PER-LINE register 
with the appropriate value. The equation for the 
value is VALUE = 2^N - 1, where N is the number 
of cycles; for example, 5 refresh cycles would 
result in VALUE = 2^5 - 1 = 31 (decimal) = 001F (hex). 
The refresh register is 14 bits wide. It generates 
one refresh every time a right shift results in a '1' bit, 
and it continues right shifting until it finds a '0' bit 
and halts. Hence, from a programming point of view, 
001F (hex) and FFDF (hex) both give 5 refresh 
cycles per line. 

• Load the shadow copies of Y, VU, and 82750DB 
pointers and pitches. 

• Load the appropriate 82750DB Register Load list 
into VRAM starting at the address pointed to by 
the 82750DB pointer. 
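
The refresh register encoding described in the first step above can be checked with a short model. The function names are hypothetical; the model counts consecutive '1' bits from the least significant end of a 14-bit register, which is how the text describes the right-shift behavior.

```python
REGISTER_BITS = 14  # width of the refresh register, per the text


def refresh_value(n_cycles):
    """Return VALUE = 2^N - 1 for N refresh cycles per REFRESH code."""
    if not 0 <= n_cycles <= REGISTER_BITS:
        raise ValueError("cycle count must fit the 14-bit register")
    return (1 << n_cycles) - 1


def refresh_cycles(value):
    """Count the refresh cycles a register value would generate.

    One refresh per right shift that yields a '1'; stop at the first '0'.
    """
    count = 0
    while count < REGISTER_BITS and value & 1:
        count += 1
        value >>= 1
    return count
```

Under this model 0x001F and 0xFFDF both yield five cycles, and the all-ones value 0x3FFF yields the 14 cycles set in the INITIAL state.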

Prior to releasing the microcode processor from its 
HALTed state to run a microcode program, the fol- 
lowing operations must be performed: 

• If 82750PB code is to be executed, bit 15 of the 
82750PB CONTROL register must be set to a 
one. 

• Load a microcode program into microcode RAM 
on the 82750PB by writing to the three instruction 
word registers (mcode1, the most significant 
word of the instruction; mcode2; and 
mcode3, the least significant word of the instruction, 
the one containing the next address 
field) and then writing to maddr, the address in 
microcode RAM where the instruction will be 
loaded. 

• Load the PC with the address in microcode RAM 
of the first instruction to be executed. 

• Write to the 82750PB CONTROL register with the 
HALT bit (bit 0) set to zero, causing the processor 
to start executing an instruction sequence, or with 
the SINGLE-STEP bit (bit 1) set to a one (keeping 
HALT also set to one), causing the processor to 
execute a single instruction. 
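
The start-up steps above can be sketched as a host-side routine. This is a minimal illustration, not a driver: `write16` is a hypothetical callback that performs one 16-bit host register write at a byte offset, each instruction is assumed to be supplied as a single 48-bit integer, and interrupt enable bits are left at zero. The byte offsets come from Tables 3-10 and 3-13.

```python
# Byte offsets from Table 3-13 (A bus destinations) and Table 3-10.
MADDR, MCODE3, MCODE2, MCODE1, PC = 0x006, 0x020, 0x022, 0x024, 0x026
CONTROL = 0x100
HALT, ENABLE_82750PB = 0x0001, 0x8000


def load_and_run(write16, program, start_pc, native_mode=True):
    """Load microcode and release the processor from its halted state.

    write16(offset, value): hypothetical 16-bit host register write.
    program: iterable of (microcode RAM address, 48-bit instruction).
    """
    mode = ENABLE_82750PB if native_mode else 0
    write16(CONTROL, HALT | mode)       # stay halted, select instruction set
    for address, instruction in program:
        write16(MCODE1, (instruction >> 32) & 0xFFFF)  # most significant word
        write16(MCODE2, (instruction >> 16) & 0xFFFF)
        write16(MCODE3, instruction & 0xFFFF)          # least significant word
        write16(MADDR, address)         # commits the instruction to RAM
    write16(PC, start_pc)
    write16(CONTROL, mode)              # HALT = 0: start executing
```

Passing a function that appends to a list in place of `write16` makes it easy to inspect the order of the register writes before trying the sequence on hardware.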



Performance Monitoring 

Two signals, FRZ# and PMON#, which are useful 
for microcode performance monitoring, are available 
both as external signals, multiplexed on a single 
output pin, and as bits in the Processor Status register. 
FRZ# is active for each T-cycle when the microcode 
processor is frozen, waiting for access to 
VRAM or to the VRAM Pointer RAM. PMON# can 
be toggled by a special ALU opcode or a special B 
bus source code. This allows PMON# to be used to 
indicate which segment of microcode is being 
executed. The PMON/FRZ bit in the Processor 
Control register selects the signal that is output. 

Freezes may indicate that the microcode routine is 
not making the most efficient use of the input and 
output FIFO buffering. This is particularly important 
for the inner loops of graphics and video routines 
that are memory-bandwidth limited. Ideally, inner 
loops should be balanced so that the rate at which 
pixels are processed is equal to the rate at which they 
can be read from and written to VRAM with no freezes. 
The buffering in the input and output FIFOs serves to 
make sequential reads and writes to VRAM more efficient 
by performing full 64-bit reads and writes, instead of 
individual 8-bit or 16-bit accesses. This has the 
effect of averaging the VRAM read/write rate over a 
number of instruction times. For example, if the 
82750PB is performing a 64-bit read or write every 8 
T-cycles, for an average of 8 bits per T-cycle, a 
two-instruction inner loop could read one 8-bit pixel and 
write one 8-bit pixel without any freezes occurring 
(assuming the source pixels and the destination 
pixels are each sequential). 

The PMON# signal provides a more standard performance 
monitoring capability by indicating when a particular 
segment of microcode, bracketed by special 
instructions that toggle the PMON# signal, is being 
executed. This allows either absolute execution-time 
measurement or measurement of the fraction of the 
total execution time that is required by the segment. 
Either the ALU opcode 'prof' or the B bus source 
code 'prof' will toggle the PMON# signal. 

An external HALT pin is provided on the 82750PB to 
allow external debugging hardware to immediately 
halt the microcode processor. Activating this input 
causes the microcode processor to halt prior to 
executing the next instruction. When the processor is 
halted, the VRAM interface portion of the 82750PB 
continues to operate normally, performing transfer 
cycles, refresh cycles, and shadow copies as 
requested by the 82750DB. 
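
The inner-loop balance argument in the Performance Monitoring discussion (a 64-bit VRAM access every 8 T-cycles against a two-instruction loop that reads and writes one 8-bit pixel) can be sketched as a simple rate comparison. The function names are hypothetical, and the sketch assumes one instruction per T-cycle, which is inferred from the worked example and not stated as a general rule.

```python
from fractions import Fraction


def vram_bits_per_tcycle(access_bits: int = 64,
                         tcycles_per_access: int = 8) -> Fraction:
    """Average VRAM bandwidth available to the microcode, in bits per T-cycle."""
    return Fraction(access_bits, tcycles_per_access)


def loop_bits_per_tcycle(bits_read: int, bits_written: int,
                         instructions: int,
                         tcycles_per_instruction: int = 1) -> Fraction:
    """Average bandwidth an inner loop demands, in bits per T-cycle.

    tcycles_per_instruction=1 is an assumption inferred from the example.
    """
    return Fraction(bits_read + bits_written,
                    instructions * tcycles_per_instruction)


def loop_runs_without_freezes(bits_read: int, bits_written: int,
                              instructions: int) -> bool:
    """True when the loop's demand does not exceed the average VRAM rate."""
    demand = loop_bits_per_tcycle(bits_read, bits_written, instructions)
    return demand <= vram_bits_per_tcycle()
```

With these assumptions, the two-instruction loop from the text demands exactly 8 bits per T-cycle and is balanced, while a one-instruction loop moving the same pixels would demand 16 bits per T-cycle and would be expected to freeze.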



Host/VRAM Timing Diagrams 

Figures 3-4 through 3-8 are Host/VRAM Timing Diagrams. 

Figure 3-4. VRAM Read and Write Cycles 

Figure 3-5. VRAM Transfer and Refresh Cycles 

Shaded areas indicate that the bidirectional signal is 
driven by the host. 



NOTES: 

1. MREQ#, RFSH#, TRNFR#, and NXTFST# remain inactive during Host Register Read and Write cycles. 

2. If HALEN#/HREQ# synchronizers are disabled then the second Ti and Tb states will be missing. 




Figure 3-6. Host Register Read and Write Cycles 

NOTES: 

1. RFSH#, TRNFR#, and NXTFST# remain inactive during Host VRAM Read and Write cycles. 

2. If the synchronizers on HREQ#/HALEN# are disabled, then the second Ti state will be missing. 

4.0 MICROCODE INSTRUCTION FORMAT 

Overview 

The 82750PB executes two slightly different instruc- 
tion formats: one that is backward compatible with 
the 82750PA and another that allows full access to 
the microcode resources of the 82750PB. The 
82750PA/82750PB bit in the 82750PB processor 
control register determines which instruction format 
is in effect (see Chapter 3). On reset, the 82750PB is 
placed in 82750PA instruction format mode. In this 
mode the 82750PB will execute binary microcode 
originally assembled for the 82750PA in a manner 
that is functionally equivalent to the 82750PA. 

The following description applies to the 82750PB in- 
struction format. Exact definitions of 82750PB in- 
struction formats and field codings are shown in Fig- 
ure 4-2 and Table 4-5. 



Instruction Sequencing 

The instruction word for 82750PB's microcode proc- 
essor is 48 bits wide. The Microcode RAM holds 512 
instructions. Nine bits of each instruction specify the 
address of the next instruction to be executed. Each 
instruction fetch reads two instructions (an even-address and 
odd-address pair) using the upper eight 
bits of the 9-bit instruction address. Both the LSB of 
the instruction address and a Condition Flag bit, se- 
lected from eight possible branching conditions, are 
used to determine whether the next instruction to be 
executed is the even address instruction or odd ad- 
dress instruction, according to the logic table shown 
as Table 4-1. 

Table 4-1. Microcode Next Instruction Selection 

LSB of     Condition      Next 
Address    Flag State     Instruction 
0          0 (FALSE)      EVEN 
0          1 (TRUE)       EVEN 
1          0 (FALSE)      ODD 
1          1 (TRUE)       EVEN 



For an unconditional branch, the condition flag 
FALSE (which is always zero) is selected; this causes 
the LSB of the address to be passed through to 
select the next instruction: LSB = 0 selects EVEN 
and LSB = 1 selects ODD. This allows unconditional 
branching to any of the 512 instructions in the 
RAM. For a conditional branch, the LSB of the 
address is set to a one; this causes the state of the 
condition flag to select the next instruction: FALSE 
selects the ODD instruction and TRUE selects the 
EVEN instruction. Therefore, a conditional branch 
jumps to either the odd or even instruction of an 
odd/even pair depending on the state of the 
condition. 
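
The selection logic of Table 4-1 reduces to a single expression: the odd instruction of the pair is chosen only when the LSB of the address is one and the selected condition flag is FALSE. The sketch below is an illustration of that table; the function name is hypothetical.

```python
def next_instruction_address(naddr: int, condition_flag: bool) -> int:
    """Resolve the 9-bit address of the next instruction per Table 4-1.

    naddr is the 9-bit next-address field; condition_flag is the state of
    the flag selected by CFSEL (always False for an unconditional branch).
    """
    pair_base = naddr & 0x1FE          # upper eight bits select the pair
    lsb = naddr & 1
    select_odd = lsb == 1 and not condition_flag
    return pair_base | int(select_odd)
```

For a conditional branch the assembler would place the "condition true" target at the even address of a pair and the "condition false" target at the odd address.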



Instruction Word Field Descriptions 

Each field of the microcode instruction format is de- 
scribed in the following sections. 



NADDR— NEXT INSTRUCTION ADDRESS FIELD 

This field holds the address of the next instruction to 
be executed. Taking advantage of the fact that the 
microcode RAM is physically organized as 256 deep 
by 96 wide (two instructions are fetched per read 
cycle), a zero delay two-way branch can be 
achieved. The only case in which this field is not 
used to determine the address of the next instruc- 
tion to be executed is when an instruction writes to 
the PC. (The term PC refers to the register that holds 
the address of the next instruction to be executed.) 
When an instruction loads the PC a one instruction 
delay occurs before the load takes effect. Therefore, 
the instruction pointed to by the next instruction field 
of the instruction that loads the PC is executed be- 
fore the jump to the new address occurs. This is 
shown in Table 4-2. 

There are no restrictions on the instruction following 
a PC load; it will always be executed, even while 
single stepping the processor or if the processor is 
frozen on that instruction. 
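
The one-instruction delay on PC loads can be illustrated with a small simulation of the sequence in Table 4-2. This is a behavioral sketch only: the program representation, the function names, and the register dictionary are hypothetical, and nothing here models freezes or single stepping.

```python
def run(program, start, max_steps=16):
    """Execute a toy program with delayed PC loads.

    program maps address -> (operation, naddr). An operation receives the
    register dict and returns a new PC value if it loads the PC, else None.
    """
    regs, trace = {}, []
    pc, pending = start, None
    for _ in range(max_steps):
        if pc not in program:
            break
        operation, naddr = program[pc]
        trace.append(pc)
        loaded = operation(regs)
        if pending is not None:
            # The delayed PC load takes effect; this NADDR is ignored.
            pc, pending = pending, None
        else:
            pc = naddr
        if loaded is not None:
            pending = loaded
    return trace, regs


def load_pc_with_zero(regs):
    return 0


def set_r0(regs):
    regs["r0"] = 1


def copy_r0_to_r1(regs):
    regs["r1"] = regs["r0"]


# Addresses and NADDR values follow Table 4-2.
TABLE_4_2 = {
    10: (load_pc_with_zero, 55),
    55: (set_r0, None),        # NADDR is "don't care" here
    0: (copy_r0_to_r1, 25),
}
```

Running `run(TABLE_4_2, start=10)` visits addresses 10, 55, and 0 in that order, so the instruction at address 55 executes before the jump to address 0 occurs.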



CFSEL— CONDITION FLAG SELECT FIELD 

This field selects which condition flag will be used 
with the LSB of NADDR to select the next instruction 
from the odd/even pair. The condition flag 
assignment is given in Table 4-3. 

Table 4-2. PC Load Example 

Addr   Instruction   NADDR   Comments 
10     pc = 0        55      Load PC with zero. 
55     r0 = 1        X       This instruction is executed but its next 
                             address field is ignored. 
0      r1 = r0       25      PC load takes effect after a one-instruction 
                             delay; the result is that r1 = r0 = 1. 



Table 4-3. Condition Flag Select Field Assignments 

Value   Flag     Description 
000     FALSE    Select for Unconditional Branch 
001     CARRY    Carry Out from ALU Condition Flag Latch 
010     OVF      Overflow from ALU Condition Flag Latch 
011     SIGN     Sign from ALU Condition Flag Latch 
100     ZERO     Zero from ALU Condition Flag Latch 
101     LCNTZ    TRUE if Selected Loop Counter = 0 
110     LSB      LSB of Data Register r0 
111     MSB      MSB of Data Register r0 




NOTE: 

The ALU condition flags (CARRY, OVF, SIGN, and ZERO) are latched in the ALU Condition Flag register. This register is 
updated for most — but not all— ALU operations. The remaining flags (LCNTZ, LSB, and MSB) are updated and latched each 
cycle. 



ASRC— A BUS SOURCE SELECT FIELD 

This field selects the element that should drive its 
data onto the A bus during the execution of this in- 
struction. The mapping for this and the following 
three fields is provided in Chapter 6. 



ADST— A BUS DESTINATION SELECT FIELD 

This field selects which element should latch data 
from the A bus during the execution of this instruc- 
tion. See ASRC above. 



CNT— DECREMENT LOOP COUNTER BIT 

A one in this bit position causes the selected Loop 
Counter (selected by LC, the loop counter select bit) 
to be decremented. The new value of the loop coun- 
ter and the updated LCNTZ condition flag are not 
ready until the next instruction cycle. Therefore, in a 
loop where the loop counter is decremented and 
tested for zero in the same instruction (typically in a 
one instruction loop), the start value for the loop 
counter should be one less than the number of times 
the loop should be executed. 
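
The reason for loading the loop counter with one less than the desired iteration count can be seen in a short behavioral sketch of a one-instruction loop. The function name is hypothetical, and the model assumes only what the paragraph above states: the instruction tests the LCNTZ flag as it stood before its own decrement takes effect. Counter width and wrap-around are not modeled.

```python
def one_instruction_loop_executions(start_value: int) -> int:
    """Count executions of a loop that decrements and tests in one instruction."""
    counter = start_value
    executions = 0
    while True:
        lcntz = counter == 0   # flag visible to this instruction
        executions += 1
        counter -= 1           # CNT = 1: result is ready next cycle
        if lcntz:
            break              # branch out when the flag was TRUE
    return executions
```

With a start value of N - 1 the loop body executes N times; loading N instead would give N + 1 executions.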



BSRC— B BUS SOURCE SELECT FIELD 

Same as ASRC, but for B bus. See ASRC above. 

BDST— B BUS DESTINATION SELECT FIELD 

Same as ADST, but for B bus. See ADST above. 



LIT— LITERAL SELECT BIT 

When this bit is a one, the ASRC and CFSEL fields 
are replaced with a 9-bit literal value that is driven as 
a source in the least significant 9 bits of the A bus. In 
this case, the upper 7 bits of the A bus are forced to 
zeros. The mapping of bits from the literal field to the 
A bus is shown in Figure 4-1. 
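
As an illustration of the literal mechanism, the sketch below extracts the 9-bit A bus literal from a 48-bit instruction word. The bit positions (LIT at bit 37, ASRC at bits 17-12, CFSEL at bits 11-9) are read from Figure 4-2; the assumption that instruction bit 9 maps to A bus bit 0 in straight order, and the function name, are not confirmed by the text.

```python
LIT_BIT = 37          # literal select bit, per Figure 4-2
LITERAL_LSB = 9       # CFSEL occupies bits 11-9, ASRC bits 17-12
LITERAL_MASK = 0x1FF  # nine bits


def a_bus_literal(instruction: int):
    """Return the 16-bit A bus value driven by a literal, or None if LIT = 0.

    The upper 7 bits of the A bus are forced to zero, so the result never
    exceeds 0x01FF.
    """
    if not (instruction >> LIT_BIT) & 1:
        return None
    return (instruction >> LITERAL_LSB) & LITERAL_MASK
```

Because the literal occupies the CFSEL field, an instruction carrying an A bus literal cannot also select a condition flag, which matches the restriction on conditional branches noted for this field.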

NOTE 



A conditional branch and a literal on the A bus are 
not allowed in the same instruction. A 3-bit literal 
can be placed on the B bus in any instruction. 

A bus bits:        15 14 13 12 11 10 9     8  7  6  5  4  3  2  1  0 
Inst. word bits:   (forced to zero)       17 16 15 14 13 12 11 10  9 
Field:                                    ASRC field (17-12), CFSEL field (11-9) 

Figure 4-1. Literal Field Mapping onto A Bus 



SHFT— SHIFT CONTROL FIELD 

This field controls the bit shifting and byte swapping 
logic associated with register r0. The encoding of 
this field is given in Table 4-4. 

Table 4-4. SHIFT Control Field Coding 

SHFT   Operation 
00     No Shift or Swap Operation 
01     Shift r0 Right One Bit Position, Sign Extend 
10     Shift r0 Left One Bit Position, Zero Fill 
11     Byte Swap the Value Being Loaded into r0* 

*Byte swapping only works when r0 is the destination on the 
A bus or the B bus. It does not swap data held in r0, only data 
being loaded. In order to byte swap data in register r0, r0 
must be both a source and destination for either the A or B 
bus. 
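
The three operations in Table 4-4 can be written out as bit manipulations. This sketch assumes a 16-bit r0 (consistent with the 16-bit A bus described for the literal field); the function names are hypothetical.

```python
MASK16 = 0xFFFF  # assumed register width: 16 bits


def shift_right_sign_extend(r0: int) -> int:
    """SHFT = 01: shift right one bit, replicating the sign bit."""
    return ((r0 >> 1) | (r0 & 0x8000)) & MASK16


def shift_left_zero_fill(r0: int) -> int:
    """SHFT = 10: shift left one bit, filling bit 0 with zero."""
    return (r0 << 1) & MASK16


def byte_swap_on_load(value: int) -> int:
    """SHFT = 11: swap the two bytes of the value being loaded into r0."""
    return ((value & 0x00FF) << 8) | ((value >> 8) & 0x00FF)
```

Note that the byte swap acts on the incoming bus value, so swapping the current contents of r0 requires r0 to be both source and destination in the same instruction, as the table footnote explains.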



ALUSS— ALU SOURCE SELECT BITS 

These two bits are used as enables for the two ALU 
input latches. Bit 39 enables the latch that connects 
to the A bus; bit 38 enables the latch connected to 
the B bus. A one in either bit position causes the 
corresponding input latch to latch the value on the 
bus to which it is connected (the A or B bus). A zero 
on either bit causes the corresponding latch to hold 
its current content. This allows the ALU operands 
either to come from "eavesdropping" on the A or B 
bus transfers occurring in the current instruction 
cycle or to be held for multiple instruction cycles in 
either the A or B input latch. 

ALUOP— ALU OPERATION CODE FIELD 

This field specifies the ALU instruction to be 
performed during the current instruction cycle. The 
encoding of this field is given in Figure 4-2. Normally, at 
the end of the instruction execution, the result of the 
ALU operation is latched in the ALU output latch, which 
can be a source on either the A or B bus. However, 
if a NOP is selected for the ALU operation, the 
ALU output latch is not latched; the data is held 
from the previous instruction. In addition to NOP, 
certain other ALU opcodes do not actually perform 
ALU operations and therefore do not latch the ALU 
results. They are INT (microcode interrupt) and the 
PROF instruction. 

LC— LOOP COUNTER SELECT BIT 

This bit selects which of the two loop counters is to 
be used for decrementing or Loop-Counter-Zero 
conditional branching in the current instruction. A 
zero selects loop counter zero and a one selects 
loop counter one. 
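
The field descriptions above can be tied together with a small decoder that splits the three 16-bit instruction word registers into the named fields. The bit positions follow the layout in Figure 4-2 (LC at bit 47 down to NADDR at bits 8-0); the function and table names are hypothetical, and this is a sketch for inspecting instruction words, not an assembler.

```python
from typing import Dict

# (field name, most significant bit, width in bits), per Figure 4-2
FIELDS = [
    ("LC", 47, 1), ("SHFT", 46, 2), ("ALUOP", 44, 5), ("ALUSS", 39, 2),
    ("LIT", 37, 1), ("CNT", 36, 1), ("BDST", 35, 6), ("BSRC", 29, 6),
    ("ADST", 23, 6), ("ASRC", 17, 6), ("CFSEL", 11, 3), ("NADDR", 8, 9),
]


def decode(mcode1: int, mcode2: int, mcode3: int) -> Dict[str, int]:
    """Split an instruction (mcode1 = most significant word) into fields."""
    word = ((mcode1 & 0xFFFF) << 32) | ((mcode2 & 0xFFFF) << 16) | (mcode3 & 0xFFFF)
    fields = {}
    for name, msb, width in FIELDS:
        lsb = msb - width + 1
        fields[name] = (word >> lsb) & ((1 << width) - 1)
    return fields
```

Two fields straddle word boundaries: BDST spans mcode1 bits 3-0 and mcode2 bits 15-14, and ASRC spans mcode2 bits 1-0 and mcode3 bits 15-12.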

Refer to the Intel 82750PB Microcode Programming 
Guide for more information on microcode programming. 

Table 4-5. 82750PB Source/Destination Coding 

Address   BDST        BSRC        ADST        ASRC 
(Hex) 
0x0       Null        Null        Null        Null 
0x1                   alu                     hwid 
0x2       *dram3      *dram3                  cc 
0x3       *dram4      *dram4      maddr 
0x4       *dram3++    *dram3++                alu 
0x5       *dram4++    *dram4++    cnt         cnt 
0x6       *dram3--    *dram3--    cnt2        cnt2 
0x7       *dram4--    *dram4--    lcnt        lcnt 
0x8       r0          r0          r0          r0 
0x9       r1          r1          r1          r1 
0xA       r2          r2          r2          r2 
0xB       r3          r3          r3          r3 
0xC       r4          r4          r4          r4 
0xD       r5          r5          r5          r5 
0xE       r6          r6          r6          r6 
0xF       r7          r7          r7          r7 
0x10      r8          *in1        mcode3      mcode3 
0x11      r9          *in2        mcode2      mcode2 
0x12      r10         *stat       mcode1      mcode1 
0x13      r11         *stat#      pc          pc 
0x14      r12         circbuf     pixint-c 
0x15      r13                     pixint      pixint 
0x16      r14                     *dram1      *dram1 
0x17      r15                     *dram2      *dram2 
0x18      circbuf     literal 0   *dram1++    *dram1++ 
0x19                  literal 1   *dram2++    *dram2++ 
0x1A      *dram1      literal 2   *dram1--    *dram1-- 
0x1B      *dram2      literal 3   *dram2--    *dram2-- 
0x1C      *dram1++    literal 4   dram1       dram1 
0x1D      *dram2++    literal 5   dram2       dram2 
0x1E      *dram1--    literal 6   dram3       dram3 
0x1F      *dram2--    literal 7   dram4       dram4 
0x20      *out1       prof        *out1       *in1 
0x21      out1++                  out1++      *in2 
0x22      out1-lo     out1-lo     shift-rl    *stat 
0x23      out1-hi     out1-hi     out1-hi     *stat# 
0x24      *out2       stat-lo     *out2 
0x25      out2++      stat-hi     out2++ 
0x26      out2-lo     out2-lo     shift-r 
0x27      out2-hi     out2-hi     out2-hi 
0x28      out1-c      out1-c      out1-c 
0x29      in1-c       in1-c       in1-c 
0x2A      in1-lo      in1-lo      shift-l 
0x2B      in1-hi      in1-hi      in1-hi 
0x2C      out2-c      out2-c      out2-c 
0x2D      in2-c       in2-c       in2-c 
0x2E      in2-lo      in2-lo 
0x2F      in2-hi      in2-hi      in2-hi 
0x30      stat-ram    r8          r8          r8 
0x31      stat-c      r9          r9          r9 
0x32      stat-lo     r10         r10         r10 
0x33      stat-hi     r11         r11         r11 
0x34      yeven-lo    r12         r12         r12 
0x35      yeven-hi    r13         r13         r13 
0x36      yodd-lo     r14         r14         r14 
0x37      yodd-hi     r15         r15         r15 
0x38      ypitch      shift       cc          shift 
0x39                  stat-c      fcnt        fcnt 
0x3A      vu-lo       *dram1      *dram3      *dram3 
0x3B      vu-hi       *dram2      *dram4      *dram4 
0x3C      vupitch     *dram1++    *dram3++    *dram3++ 
0x3D      vpitch      *dram2++    *dram4++    *dram4++ 
0x3E      vptr-lo     *dram1--    *dram3--    *dram3-- 
0x3F      vptr-hi     *dram2--    *dram4--    *dram4-- 

Instruction   Microcode word               Field               Width 
Word Bits     and bits                                         (bits) 
47            mcode1 15                    LC SEL              1 
46-45         mcode1 14-13                 SHFT CNTL           2 
44-40         mcode1 12-8                  ALU OPCODE          5 
39-38         mcode1 7-6                   ALU SS              2 
37            mcode1 5                     LIT                 1 
36            mcode1 4                     CNT                 1 
35-30         mcode1 3-0, mcode2 15-14     B Bus Destination   6 
29-24         mcode2 13-8                  B Bus Source        6 
23-18         mcode2 7-2                   A Bus Destination   6 
17-12         mcode2 1-0, mcode3 15-12     A Bus Source        6 
11-9          mcode3 11-9                  Cond Flag Select    3 
8-0           mcode3 8-0                   Next Address        9 

Code   LC SEL   SHFT CNTL   ALU SS   LIT   CNT   Cond Flag Select 
0x0    cnt      nop         hold     nop   nop   FALSE 
0x1    cnt2     shftr       latb     lit   dec   CARRY 
0x2             shftl       lata                 OVERFLOW 
0x3             swap        both                 SIGN 
0x4                                              ZERO 
0x5                                              CNT0 
0x6                                              LSBr0 
0x7                                              MSBr0 

Code   ALU OPCODE      Code   ALU OPCODE 
0x0    NOP             0x10   + 
0x1    ZERO            0x11   - 
0x2    a               0x12   -+ 
0x3    b               0x13   -a 
0x4    ~a              0x14   -b 
0x5    ~b              0x15   a++ 
0x6    &               0x16   b++ 
0x7    ~&              0x17   a-- 
0x8    &~              0x18   b-- 
0x9    ++              0x19   int 
0xA    |               0x1A   prof 
0xB    ~|              0x1B   a* 
0xC    |~              0x1C   b* 
0xD    -<              0x1D   +< 
0xE    -               0x1E   +] 
0xF    -+<             0x1F   -] 

The B Bus Destination, B Bus Source, A Bus Destination, and A Bus 
Source codes (0x0 through 0x3F) are those listed in Table 4-5. 

Figure 4-2. 82750PB Instruction Word Format 

5.0 ELECTRICAL DATA 
Maximum Ratings 



Table 5-1 is a stress rating only, and functional operation 
at the maximums is not guaranteed. Functional operat- 
ing conditions are given in the DC and AC Characteris- 
tics (Tables 5-2, 5-3, 5-4, and 5-5). 



Exposure to Maximum Ratings may affect device re- 
liability. Furthermore, although the 82750PB con- 
tains protective circuitry to resist damage from static 
electrical discharge, always take precautions to 
avoid high static voltages or electric fields. 



Table 5-1. Absolute Maximum Requirements 

Condition                                    Maximum Requirement 
Case Temperature under Bias                  -65°C to 110°C 
Storage Temperature                          -65°C to 150°C 
Voltage on Any Pin with Respect to Ground    -0.5V to VCC + 0.5V 
Supply Voltage with Respect to VSS           -0.5V to +6.5V 

DC Characteristics 


Table 5-2. DC Characteristics, VCC = 5V ±10%, TCASE = 0°C to 90°C 

Symbol   Parameter                  Min           Typ    Max          Unit   Notes 
VIL      Input LOW Voltage          -0.3                 [illegible]  V      (Note 1) 
VIH      Input HIGH Voltage         2.0                  VCC + 0.3    V      (Note 1) 
VOL      Output LOW Voltage                       0.2    0.4          V      IOL = 4.0 mA (Note 1) 
VOH      Output HIGH Voltage        [illegible]   3.0                 V      IOH = -1.0 mA (Note 1) 
ILI      Input Leakage Current      [illegible]          +10          µA     VSS < VIN < VCC 
IOZ      Output Leakage Current     -10                  +10          µA     VSS < VIN < VCC 
ICC      Power Supply Current                     150    200          mA     25 MHz (Note 2) 
CIN      Input Capacitance                               10.0         pF     FC = 1 MHz (Note 3) 
COUT     Output Capacitance                              12.0         pF     FC = 1 MHz (Note 3) 
CCLKIN   CLKIN Input Capacitance                         20.0         pF     FC = 1 MHz (Note 3) 

NOTES: 

1. Measured with CLKIN = 8 MHz. 

2. Typical current value measured under typical conditions. Maximum current value guaranteed with 50 pF maximum output 
loading. 

3. Not 100% tested. 

AC Characteristics 



Table 5-3. AC Characteristics at 25 MHz, VCC = 5V ±10%, TCASE = 0°C to +90°C, CL = 50 pF 

Symbol   Parameter                                    Min           Max           Unit   Figure   Notes 
         Frequency                                    8             25            MHz             1x Clock 
t1       CLKIN Period                                 40            125           ns     5-1 
t2       CLKIN High Time                              14            26            ns     5-1      (Note 1) 
t3       CLKIN Low Time                               14            26            ns     5-1      (Note 1) 
t4       CLKIN Fall Time                                            4             ns     5-1 
t5       CLKIN Rise Time                                            4             ns     5-1 
t6a      A[31:2], BE#[3:0], WE#, D[31:0],             3             25            ns     5-2 
         HINT#, PMFRZ# Valid Delay 
t6b      MREQ#, TRNFR#, RFSH#, NXTFST#,               3             [illegible]   ns     5-2 
         HBUSEN#, HRDY# Valid Delay 
t7       A[31:2], BE#[3:0], WE#, D[31:0]                            30            ns     5-2      (Note 2) 
         Float Delay 
t8       MRDY# Setup                                  [illegible]                 ns     5-3 
t9       MRDY# Hold                                   [illegible]                 ns     5-3 
t10      HREQ#, VBUS[3:0], RESET#, HALEN#,            8                           ns     5-3 
         HALT# Setup 
t11      HREQ#, VBUS[3:0], RESET#, HALEN#,            6                           ns     5-3 
         HALT# Hold 
t12      A[8:2], BE#[3:0], WE#, D[31:0] Setup         4                           ns     5-3      (Note 3) 
t13      A[8:2], BE#[3:0], WE#, D[31:0] Hold          6                           ns     5-3      (Note 3) 
t14      HREG#, HRAM# Setup                           10                          ns     5-3 
t15      HREG#, HRAM# Hold                            6                           ns     5-3 
t16      CLKOUT Valid Delay                                         18            ns     5-4 
t17      CLKOUT High Time                             1/2 t1 - 6    1/2 t1 + 6    ns     5-4 

NOTES: 

1. This assumes a 40 ns period. For other speeds these values should fall between 40% and 60% duty cycle. 

2. Not 100% tested. Guaranteed by design characterization. 

3. Inputs must remain valid throughout all cycles of host accesses. See Figures 3-6 through 3-8. 

4. All A.C. specifications are measured at the 1.5V crossing point with a 50 pF load. 

Figure 5-1. Clock Waveforms 

Figure 5-2. Output Waveforms 

Figure 5-3. Input Waveforms 

Figure 5-4. CLKOUT Waveforms 

Output Delay and Rise Time Versus Load Capacitance 










[Graph: typical output delay in ns (nom, nom + 2, nom + 4) versus load capacitance CL in picofarads (50 to 150).] 

NOTE: 

This graph will not be linear outside of the CL range shown. 
nom = nominal value given in the A.C. Characteristics table. 

Figure 5-5. Typical Output Valid Delay Versus Load Capacitance under Worst Case Conditions 



[Graph: rise time in ns (1 to 7), measured from 0.8V to 2.0V, versus load capacitance CL in picofarads (25 to 150).] 

NOTE: 

This graph will not be linear outside of the CL range shown. 

Figure 5-6. Typical Output Rise Time Versus Load Capacitance under Worst Case Conditions 

6.0 MECHANICAL DATA 
Packaging Outlines and Dimensions 

Intel packages the 82750PB in a Plastic Quad Flat Pack (PQFP). Table 6-1 gives the symbol list for the PQFP. 

Table 6-1. PQFP Symbol List 

Letter or 
Symbol     Description of Dimensions 
A          Package Height: Distance from Seating Plane to Highest Point of Body 
A1         Standoff: Distance from Seating Plane to Base Plane 
D/E        Overall Package Dimension: Lead Tip to Lead Tip 
D1/E1      Plastic Body Dimension 
D2/E2      Bumper Distance 
D3/E3      Footprint 
L1         Foot Length 
N          Total Number of Leads 

The PQFP has the following specifications: 

1. All dimensions and tolerances conform to ANSI Y14.5M-1982. 

2. Datum plane -H- is located at the mold parting line and is coincident with the bottom of the lead where the lead 
exits the plastic body. 

3. Datums A-B and -D- are to be determined where center leads exit the plastic body at datum plane -H-. 

4. Controlling dimension is the inch. 

5. Dimensions D1, D2, E1, and E2 are measured at the mold parting line and do not include mold protrusion. 
Allowable mold protrusion is 0.18 mm (0.007 in.) per side. 

6. Pin 1 identifier is located within one of the two zones indicated. 

7. Measured at datum plane -H-. 

8. Measured at seating plane datum -C-. 

Table 6-2 provides outline characteristics for 0.025 in. pitch. 

Table 6-2. Intel Case Outline Drawings for PQFP at 0.025 inch Pitch 

Symbol    Description           Min          Max 
N         Leadcount             132          132 
A         Package Height        0.160        0.170 
A1        Standoff              0.020        0.030 
D, E      Terminal Dimension    1.075        1.085 
D1, E1    Package Body          0.947        0.953 
D2, E2    Bumper Distance       1.097        1.103 
D3, E3    Lead Dimension        0.800 REF    0.800 REF 
L1        Foot Length           0.020        0.030 

Figure 6-1. Principal Dimensions of the 82750PB in the 132-Lead PQFP Package 

Figure 6-2. Detailed Dimensions of the 82750PB in the 132-Lead PQFP: Molding Details 

Figure 6-3. Detailed Dimensions of the 82750PB in the 132-Lead PQFP: Terminal Details 

Figure 6-4. 132-Lead PQFP Mechanical Package Detail: Protective Bumper 

Drawing dimensions are given in mm (inch). 

NOTES: 

1. All dimensions and tolerances conform to ANSI Y14.5M-1982. 

2. Datum plane -H- is located at the mold parting line and is coincident with the bottom of the lead where the lead 
exits the plastic body. 

3. Datums A-B and -D- are to be determined where center leads exit the plastic body at datum plane -H-. 

4. Controlling dimension: inch. 

5. Dimensions D1, D2, E1, and E2 are measured at the mold parting line. D1 and E1 do not include an allowable 
mold protrusion of 0.18 mm (.007 in) per side. D2 and E2 do not include a total allowable mold protrusion of 
0.18 mm (.007 in) at maximum package size. 

6. Pin 1 identifier is located within one of the two zones indicated. 

7. Measured at datum plane -H-. 

8. Measured at seating plane datum -C-. 

Package Thermal Specifications 

The 82750PB is specified for operation when TC 
(the case temperature) is within the range of 0°C to 
90°C. TC may be measured in any environment to 
determine whether the 82750PB is within the specified 
operating range. The case temperature should be 
measured at the center of the top surface. 

TA (the ambient temperature) can be calculated 
from θCA (thermal resistance from case to ambient) 
with the following equation: 

TA = TC - P × θCA 

Typical values for θCA at various airflows are given 
in Table 6-3 for the 132-lead PQFP package. Table 
6-4 shows the maximum TA allowable (without 
exceeding TC) at various airflows. The power 
dissipation (P) is calculated by using the typical supply 
current at 5V as shown in Table 5-2. 
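
The ambient temperature equation can be worked through numerically. The sketch below uses the typical supply current from Table 5-2 (150 mA at 5V, giving P = 0.75 W), the case temperature limit of 90°C, and the thermal resistance values from Table 6-3. The function name is hypothetical, and the results agree with Table 6-4 only to within rounding, so the table remains the reference.

```python
# Thermal resistance (°C/W) by airflow in ft/min, from Table 6-3
THETA_CA = {0: 26.0, 200: 17.5, 400: 14.0, 600: 11.5, 800: 9.5, 1000: 8.5}


def max_ambient_temperature(airflow_ft_per_min: int,
                            tc_max: float = 90.0,
                            vcc: float = 5.0,
                            icc_typ: float = 0.150) -> float:
    """Return TA = TC - P * theta_CA, with P = VCC * ICC(typical)."""
    power_watts = vcc * icc_typ
    return tc_max - power_watts * THETA_CA[airflow_ft_per_min]
```

For example, with no airflow the calculation gives 90 - 0.75 × 26.0 = 70.5°C, against 70°C in Table 6-4.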




Table 6-3. Thermal Resistance (°C/W) 

            θCA Versus Airflow, ft/min (m/sec) 
Package     0        200       400       600       800       1000 
            (0)      (1.01)    (2.03)    (3.04)    (4.06)    (5.07) 
132-Lead    26.0     17.5      14.0      11.5      9.5       8.5 
PQFP 

Table 6-4. Maximum TA at Various Airflows (°C) 

                        TA Versus Airflow, ft/min (m/sec) 
Package     Frequency   0        200       400       600       800       1000 
            (MHz)       (0)      (1.01)    (2.03)    (3.04)    (4.06)    (5.07) 
132-Lead    25          70       76        80        81        83        84 
PQFP 

i860™ XP MICROPROCESSOR 



Parallel Architecture that Supports Up 
to Three Operations per Clock 

— One Integer or Control Instruction 

— Up to Two Floating-Point Results 

High Performance Design 

— 40/50 MHz Clock Rate 

— 100 Peak Single Precision MFLOPS 

— 75 Peak Double Precision MFLOPS 

— 64-Bit External Data Bus 

— 64-Bit Internal Code Bus 

— 128-Bit Internal Data Bus 

High Integration on One Chip 

— 32-Bit Integer and Control Unit 

— 32/64-Bit Pipelined Floating-Point 

— 64-Bit 3-D Graphics Unit 

— Paging Unit with 64 Four-Kbyte and 
16 Four-Mbyte Pages 

— 16 Kbyte Code Cache 
— 16 Kbyte Data Cache 

Fast, Multiprocessor-Oriented Bus 

— Burst Cycles Move 400 Mbyte/Sec 

— Hardware Cache Snooping 

— MESI Cache Consistency Protocol 

— Supports Second-Level Cache 

— Supports DRAM 






Compatible with Industry Standards 

— ANSI/IEEE Standard 754-1985 for 
Binary Floating-Point Arithmetic 

— Intel386™/Intel486™/i860™ Data 
Formats and Page Table Entries 

— Binary Compatible with i860™ XR 
Applications Instruction Set 

— Detached Concurrency Control Unit 
(CCU) Supports Parallel Architecture 
Extensions (PAX) 

— JEDEC 262-pin Ceramic Pin Grid 
Array Package 

— IEEE Standard 1149.1/D6 Boundary- 
Scan Architecture 

Easy to Use 

— On-Chip Debug Register 

— UNIX*/860 

— APX Attached Processor Executive 

— Assembler, Linker, Simulator, 
Debugger, C and FORTRAN 
Compilers, FORTRAN Vectorizer, 
Scalar and Vector Math Libraries 

— Graphics Libraries 




The Intel i860 XP Microprocessor (order code A80860XP) delivers supercomputing performance in a single 
VLSI component. The 32/64-bit architecture of the i860 XP microprocessor balances integer, floating point, 
and graphics performance for applications such as engineering workstations, scientific computing, 3-D graph- 
ics workstations, and multiuser systems. Its parallel architecture achieves high throughput with RISC design 
techniques, multiprocessor support, pipelined processing units, wide data paths, large on-chip caches, 2.5 
million transistor design, and fast 0.8-micron silicon technology. 



Figure 0.1. Block Diagram (recoverable labels: A31-A3, D63-D0, CONTROL, FP REGISTER FILE, GRAPHICS UNIT)

*UNIX is a registered trademark of UNIX System Laboratories, Inc. 
Intel, i860, Intel386 and Intel486 are trademarks of Intel Corporation. 
1.0 FUNCTIONAL DESCRIPTION

As shown by the block diagram on the front page, 
the i860 XP Microprocessor consists of the following 
units: 

1. Integer Registers and Core Execution Unit 

2. Floating-Point Registers and Control Unit 

3. Floating-Point Adder Unit 

4. Floating-Point Multiplier Unit 

5. Graphics Unit 

6. Paging Unit 

7. Instruction Cache 

8. Data Cache 

9. Bus and Cache Control Unit 

10. Detached Concurrency Control Unit 

The core execution unit controls overall operation of 
the i860 XP microprocessor. It executes load, store, 
integer, bit, I/O, and control-transfer operations, and 
fetches instructions for the floating-point unit as well. 
A set of 32 x 32-bit general-purpose registers is
provided for the manipulation of integer data. Load
and store instructions move 8-, 16-, and 32-bit data
to and from these registers. Its full set of integer,
logical, and control-transfer instructions gives the
core unit the ability to execute complete systems
software and applications programs. A trap mecha- 
nism provides rapid response to exceptions and ex- 
ternal interrupts. Debugging is supported by the abili- 
ty to trap on data or instruction reference. 

The floating-point hardware is connected to a sepa- 
rate set of floating-point registers, which can be ac- 
cessed as 16 x 64-bit registers or as 32 x 32-bit 
registers. Load and store instructions can also ac- 
cess these same registers as 8 x 128-bit registers. 
All floating-point and graphics instructions use these 
registers as their source and destination operands. 

The floating-point control unit controls both the float- 
ing-point adder and the floating-point multiplier, issu- 
ing instructions, handling all source and result ex- 
ceptions, and updating status bits in the floating- 
point status register. The adder and multiplier can 
operate in parallel, producing up to two results per 
clock. The floating-point data types, floating-point in- 
structions, and exception handling all support the 
IEEE Standard for Binary Floating-Point Arithmetic 
(ANSI/IEEE Std 754-1985). 

The floating-point adder performs addition, subtrac- 
tion, comparison, and conversions on 64- and 32-bit 
floating-point values. An adder instruction executes 
in three clocks; however, in pipelined mode, a new 
result is generated every clock. 



The floating-point multiplier performs floating-point 
and integer multiply as well as floating-point reciprocal
operations on 64- and 32-bit floating-point values.
A multiplier instruction executes in three to four
clocks; however, in pipelined mode, a new result can 
be generated every clock for single-precision and 
every other clock for double precision. 
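
The pipelined result rates above account for the peak MFLOPS figures quoted on the front page. The following is a minimal sketch of that arithmetic; the function name `peak_mflops` is an illustrative assumption, not part of any Intel tool.

```python
def peak_mflops(clock_mhz, double_precision):
    """Peak floating-point results per microsecond with both pipelines full.

    Assumes the rates stated in the text: the adder delivers one result per
    clock; the multiplier delivers one result per clock in single precision
    and one result every other clock in double precision.
    """
    adder_rate = clock_mhz
    multiplier_rate = clock_mhz / 2 if double_precision else clock_mhz
    return adder_rate + multiplier_rate


print(peak_mflops(50, double_precision=False))  # single precision at 50 MHz
print(peak_mflops(50, double_precision=True))   # double precision at 50 MHz
```

At 50 MHz this gives 100 for single precision and 75 for double precision, matching the quoted peak ratings.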

The graphics unit supports three-dimensional draw- 
ing in a graphics frame buffer, with color intensity 
shading and hidden surface elimination via the 
Z-buffer algorithm. The graphics unit recognizes the 
pixel as an 8-, 16-, or 32-bit integer data type. It can 
compute individual red, blue, and green color inten- 
sity values within a pixel; but it does so with parallel 
operations that take advantage of the 64-bit internal 
word size and 64-bit external bus. The graphics fea- 
tures of the i860 XP microprocessor assume that the 
surface of a solid object is drawn with polygon 
patches which, like the pieces of a puzzle, collec- 
tively approximate the shape of the original object. 
The color intensities of the vertices of the polygon 
and their distances from the viewer are known, but 
the distances and intensities of the other points 
must be calculated by interpolation. The graphics in- 
structions of the i860 XP microprocessor directly aid 
such interpolation. 

The paging unit implements protected, paged, virtual 
memory. The paging unit uses two four-way set-as- 
sociative cache memories called TLBs (Translation 
Lookaside Buffers) to perform the translation of logi- 
cal address to physical address, and to check for 
access violations. The access protection scheme 
employs two levels of privilege: user and supervisor. 
One TLB supports 4 Kbyte pages, and has 64 entries;
the other supports 4 Mbyte pages, and has 16
entries.

The instruction cache is a four-way set-associative 
memory of 16 Kbytes, with 32-byte lines. It transfers 
up to 64 bits per clock (400 Mbyte/sec at 50 MHz). 

The data cache is a four-way set-associative memory
of 16 Kbytes, with 32-byte lines. It transfers up to
128 bits per clock (800 Mbyte/sec at 50 MHz). The 
i860 XP microprocessor normally uses write-back 
caching, i.e. memory writes update the cache (if ap- 
plicable) without necessarily updating memory im- 
mediately; however, under both software and hard- 
ware control, write-through and write-once policies 
can be implemented, or caching can be inhibited. 
The caches are transparent to applications soft- 
ware. 
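
The cache transfer rates quoted above follow directly from the path width and the clock rate. A short sketch of the calculation (the helper name is hypothetical):

```python
def peak_mbytes_per_sec(bits_per_clock, clock_mhz):
    """Peak transfer rate in Mbyte/sec for a path moving bits_per_clock each clock."""
    bytes_per_clock = bits_per_clock // 8
    return bytes_per_clock * clock_mhz


instruction_cache_rate = peak_mbytes_per_sec(64, 50)   # 64-bit code path
data_cache_rate = peak_mbytes_per_sec(128, 50)         # 128-bit data path
print(instruction_cache_rate, data_cache_rate)
```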

The bus and cache control unit performs data and 
instruction accesses for the core unit. It receives cy- 
cle requests and specifications from the core unit, 
performs the data-cache or instruction-cache miss 
processing, controls TLB translation, and provides 
the interface to the external bus. Its pipelined structure
supports up to three outstanding bus cycles. Its
burst mode transfers data at up to 400 Mbyte/sec at 
50 MHz. In multiprocessor systems, it maintains 
cache consistency by monitoring bus activity in par- 
allel with other CPU functions. 

The DCCU (detached concurrency control unit) is a 
compatible subset of the external CCU that expe- 
dites loop-level parallelism and synchronization in 
multiprocessor systems. The DCCU consists of reg- 
isters and a counter that allow a single i860 XP mi- 
croprocessor to run binary code compiled for a mul- 
tiprocessor system adhering to the PAX parallel ap- 
plications binary interface (ABI). 

The i860 XP microprocessor may be used with or
without an external, secondary cache built from 
82495XP and 82490XP cache components. An 
82495XP and 82490XP cache provides up to 512 
Kbytes of high-speed storage for data and instruc- 
tion combined. In most cases, an 82495XP and 
82490XP cache can provide data to the CPU with 
zero wait states. The larger size of an external cache 
can provide an increased hit rate when the size or 
number of data structures and programs exceeds 
the size of the internal caches. In multiprocessor 
systems, the external cache serves as local memo- 
ry, and can reduce bus traffic. An external cache 
also hides the processor from the rest of the system, which
is a double advantage: 

1. The processor can be upgraded without affecting
design of the memory and other subsystems. 

2. Slower and less expensive memory and I/O sub- 
system designs can be employed without unduly 
lowering overall system performance. 

Refer to the 82495XP Cache Controller/ 82490XP 
Cache RAM Data Sheet (Intel Order #240956) for 
more information. 



2.0 PROGRAMMING INTERFACE 

The programmer-visible aspects of the architecture 
of the i860 XP microprocessor include data types, 
registers, instructions, and traps. 



2.1 Data Types 

The i860 XP microprocessor provides operations for 
integer and floating-point data. Integer operations 
are performed on 32-bit operands with some support 
also for 64-bit operands. Load and store instructions 
can reference 8-bit, 16-bit, 32-bit, 64-bit, and 128-bit
operands. Floating-point operations are performed 
on IEEE-standard 32- and 64-bit formats. Graphics 
instructions operate on arrays of 8-, 16-, or 32-bit 
pixels. 



2.1.1 INTEGER 

An integer is a 32-bit signed value in standard two's 
complement form. A 32-bit integer can represent a 
value in the range -2,147,483,648 (-2^31) to
2,147,483,647 (+2^31 - 1). Arithmetic operations on
8- and 16-bit integers can be performed by sign-extending
the 8- or 16-bit values to 32 bits, then using
the 32-bit operations. 

There are also add and subtract instructions that op- 
erate on 64-bit long integers. 

Load and store instructions may also reference (in 
addition to the 32- and 64-bit formats previously 
mentioned) 8- and 16-bit items in memory. When an 
8- or 16-bit item is loaded into a register, it is con- 
verted to an integer by sign-extending the value to 
32 bits. When an 8- or 16-bit item is stored from a 
register, the corresponding number of low-order bits 
of the register are used. 
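
The load and store conversions described above can be sketched as follows. This is an illustration of sign extension and low-order truncation only; the function names are assumptions made for the example.

```python
def load_signed(raw, bits):
    """Sign-extend an 8- or 16-bit memory item to a signed integer value."""
    raw &= (1 << bits) - 1          # keep only the item's own bits
    sign = 1 << (bits - 1)          # position of the item's sign bit
    return (raw ^ sign) - sign      # standard two's complement extension


def store_low(register_value, bits):
    """Return the low-order bits of a 32-bit register that would be stored."""
    return register_value & ((1 << bits) - 1)


print(load_signed(0xFF, 8), load_signed(0x7F, 8))
print(hex(store_low(0x12345678, 16)))
```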



2.1.2 ORDINAL 

Arithmetic operations are available for 32-bit ordi- 
nals. An ordinal is an unsigned integer. An ordinal 
can represent values in the range 0 to
4,294,967,295 (+2^32 - 1).

Also, there are add and subtract instructions that op- 
erate on 64-bit ordinals. 



2.1.3 SINGLE- AND DOUBLE-PRECISION REAL 

Figure 2.1 shows the real number formats. A single- 
precision real (also called "single real") data type is 
a 32-bit binary floating-point number. Bit 31 is the 
sign bit; bits 30..23 are the exponent; and bits 22..0 
are the fraction. In accordance with ANSI/IEEE 
standard 754, the value of a single-precision real is 
defined as follows: 

1. If e = 0 and f ≠ 0, or e = 255, then generate a
floating-point source-exception trap when encountered
in a floating-point operation.

2. If 0 < e < 255, then the value is (-1)^s x 1.f x
2^(e-127).

3. If e = 0 and f = 0, then the value is signed zero.

A double-precision real (also called "double real") 
data type is a 64-bit binary floating-point number. Bit 
63 is the sign bit; bits 62..52 are the exponent; and
bits 51..0 are the fraction. In accordance with ANSI/
IEEE standard 754, the value of a double-precision 
real is defined as follows: 

1. If e = 0 and f ≠ 0, or e = 2047, then generate a
floating-point source-exception trap when encountered
in a floating-point operation.

2. If 0 < e < 2047, then the value is (-1)^s x 1.f x
2^(e-1023).

3. If e = 0 and f = 0, then the value is signed zero.

The special values infinity, NaN ("Not a Number"), 
indefinite, and denormal generate a trap when en- 
countered. The trap handler implements IEEE-stan- 
dard results. 
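
The three cases for each format can be expressed as a small decoder. This is a sketch under the rules listed above, not a model of the hardware; the function names and the returned tuples are assumptions for illustration.

```python
def decode_real(bits, exp_bits, frac_bits):
    """Classify a raw bit pattern using the three cases listed in the text.

    Returns ("trap", None) for patterns that raise a source-exception trap,
    ("zero", value) for signed zero, and ("normal", value) otherwise.
    """
    s = (bits >> (exp_bits + frac_bits)) & 1
    e = (bits >> frac_bits) & ((1 << exp_bits) - 1)
    f = bits & ((1 << frac_bits) - 1)
    e_max = (1 << exp_bits) - 1            # 255 for single, 2047 for double
    bias = (1 << (exp_bits - 1)) - 1       # 127 for single, 1023 for double
    if (e == 0 and f != 0) or e == e_max:
        return ("trap", None)              # denormal, infinity, or NaN
    if e == 0:
        return ("zero", -0.0 if s else 0.0)
    value = (-1) ** s * (1 + f / 2 ** frac_bits) * 2.0 ** (e - bias)
    return ("normal", value)


def decode_single(bits):
    return decode_real(bits, exp_bits=8, frac_bits=23)


def decode_double(bits):
    return decode_real(bits, exp_bits=11, frac_bits=52)
```

For example, `decode_single(0x3F800000)` falls under case 2 with e = 127 and f = 0, so the value is 1.0, while `decode_single(0x7F800000)` has e = 255 and is reported as a trap.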

A double real value occupies an even/odd pair of 
floating-point registers. Bits 31..0 are stored in the 
even-numbered floating-point register; bits 63..32
are stored in the next higher odd-numbered floating- 
point register. 



2.1.4 PIXEL 

A pixel may be 8, 16, or 32 bits long, depending on
color and intensity resolution requirements. Regardless
of the pixel size, the i860 XP microprocessor
always operates on 64 bits of pixel data at a time. 
The pixel data type is used by two kinds of instruc- 
tions: 

• The selective pixel-store instruction that helps im- 
plement hidden surface elimination. 

• The pixel add instruction that helps implement
3-D color intensity shading. 

To perform color intensity shading efficiently in a va- 
riety of applications, the i860 XP microprocessor de- 
fines three pixel formats according to Table 2.1. 

Figure 2.2 illustrates one way of assigning meaning 
to the fields of pixels. These assignments are for 
illustration purposes only. The i860 XP microproces- 
sor defines only the field sizes, not the specific use 
of each field. Other ways of using the fields of pixels 
are possible. 
Figure 2.1. Real Number Formats

Table 2.1. Pixel Formats

Pixel Size   Bits of Color 1   Bits of Color 2   Bits of Color 3   Bits of Other Attribute
(in bits)    Intensity(1)      Intensity(1)      Intensity(1)      (Texture, Color)

8            N (<= 8) bits of intensity(2)                         8 - N
16           6                 6                 4
32           8                 8                 8                 8



NOTES: 

1. The intensity attribute fields may be assigned to colors in any order convenient to the application.

2. With 8-bit pixels, up to 8 bits can be used for intensity; the remaining bits can be used for any other attribute, such as 
color or texture. Bits that require interpolation (shading), such as those for intensity, must be the low-order bits of the pixel. 
NOTE:

These assignments of specific meanings to the fields of pixels are for illustration only. Only the field sizes are defined, 
not the specific use of each field. 



Figure 2.2. Pixel Format Example 



2.2 Register Set 



As Figure 2.3 shows, the i860 XP microprocessor 
has the following registers: 

• An integer register file

• A floating-point register file

• Control registers psr, epsr, db, dirbase, fir, fsr,
bear, ccr, p3, p2, p1, p0

• Special-purpose registers KR, KI, T, MERGE,
STAT, and NEWCURR

The control registers are accessible only by load 
and store control-register instructions; the integer 
and floating-point registers are accessed by arithme- 
tic operations and load and store instructions. The 
special-purpose registers KR, KI, and T are used by
floating-point instructions; MERGE is used by graph- 
ics instructions. NEWCURR and STAT are used for 
concurrency control; they are accessed by memory 
load and store instructions. 

2.2.1 INTEGER REGISTER FILE 

There are 32 integer registers, each 32 bits wide, 
referred to as r0 through r31, which are used for
address computation and scalar integer computations.
Register r0 always returns zero when read.

2.2.2 FLOATING-POINT REGISTER FILE 

There are 32 floating-point registers, each 32 bits
wide, referred to as f0 through f31, which are used
for floating-point computations. Registers f0 and f1
always return zero when read. The floating-point 
registers are also used by a set of integer opera- 
tions, primarily for graphics computations. 



When accessing 64-bit floating-point or integer val- 
ues, the i860 XP microprocessor uses an even/odd 
pair of registers. When accessing 128-bit values, it 
uses an aligned set of four registers (f0, f4, f8, f12,
f16, f20, f24, or f28). The instruction must designate
the lowest register number of the set of registers 
containing 64- or 128-bit values. Misaligned register 
numbers produce undefined results. The register 
with the lowest number contains the least significant 
part of the value. For 128-bit values, the register pair 
with the lower number contains the value from the 
lower memory address; the register pair with the 
higher number contains the value from the higher 
address. 
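
The register-grouping rules above can be sketched as a small checker. Note one deliberate difference: the text says misaligned register numbers produce undefined results, whereas this illustration raises an error so the mistake is visible. The function names are assumptions.

```python
def register_group(first, width_bits):
    """Return the floating-point register numbers that hold one value.

    first is the lowest register number named by the instruction;
    width_bits is 32, 64, or 128.
    """
    if width_bits not in (32, 64, 128):
        raise ValueError("width must be 32, 64, or 128 bits")
    count = width_bits // 32
    if not 0 <= first <= 32 - count:
        raise ValueError("register number out of range")
    if first % count != 0:
        # Undefined on the real part; rejected here for clarity.
        raise ValueError("misaligned register number")
    return list(range(first, first + count))


def place_double(first, bits64):
    """Map a 64-bit value onto an even/odd pair, least significant part first."""
    low_reg, high_reg = register_group(first, 64)
    return {low_reg: bits64 & 0xFFFFFFFF, high_reg: (bits64 >> 32) & 0xFFFFFFFF}


print(register_group(4, 128))
print(place_double(2, 0x1122334455667788))
```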

The 128-bit load and store instructions, along with 
the 128-bit data path between the floating-point reg- 
isters and the data cache, help to sustain an extraor- 
dinarily high rate of computation. 

2.2.3 PROCESSOR STATUS REGISTER 

The processor status register (psr) contains miscel- 
laneous state information for the current process. 
Figure 2.4 shows the format of the psr. 

• BR (Break Read) and BW (Break Write) enable a 
data access trap when the operand address 
matches the address in the db register and a 
read or write (respectively) occurs. 

• Various instructions set CC (Condition Code) according
to tests they perform. The branch-on-condition-code
instructions test its value. The bla
instruction sets and tests LCC (Loop Condition
Code).

• IM (Interrupt Mode), if set, enables external inter- 
rupts on the INT pin; disables interrupts on INT if 
clear. IM does not affect parity error interrupts or 
interrupts on the BERR pin. 
Figure 2.3. Registers and Data Paths



• U (User Mode) is set when the i860 XP microprocessor
is executing in user mode; it is clear
when the i860 XP microprocessor is executing in 
supervisor mode. In user mode, writes to some 
control registers are inhibited. This bit also con- 
trols the memory protection mechanism. 

• PIM (Previous Interrupt Mode) and PU (Previous
User Mode) save the corresponding status bits 
(IM and U) on a trap, because those status bits 
are changed when a trap occurs. They are re- 
stored into their corresponding status bits when 
returning from a trap handler with a branch indi- 
rect instruction when a trap flag is set in the psr. 



• FT (Floating-Point Trap), DAT (Data Access
Trap), IAT (Instruction Access Trap), IN (Interrupt),
and IT (Instruction Trap) are trap flags.
They are set when the corresponding trap condi- 
tion occurs. IN is set on INT, bus error and parity 
error. The trap handler examines these bits (and 
other trap bits in the epsr) to determine which 
condition or conditions have caused the trap. 

• DS (Delayed Switch) is set if a trap occurs during
the instruction before dual-instruction mode is entered
or exited. If DS is set and DIM (Dual Instruction
Mode) is clear, the i860 XP microprocessor
switches to dual-instruction mode one instruction
after returning from the trap handler. If DS and DIM
are both set, the i860 XP microprocessor switches
to single-instruction mode one instruction after returning
from the trap handler.

Figure 2.4. Processor Status Register

• When a trap occurs, the i860 XP microprocessor
sets DIM if it is executing in dual-instruction 
mode; it clears DIM if it is executing in single-in- 
struction mode. If DIM is set after returning from a 
trap handler, the i860 XP microprocessor re- 
sumes execution in dual-instruction mode. 

• When KNF (Kill Next Floating-Point Instruction) is
set, the next floating-point instruction is sup- 
pressed (except that its dual-instruction mode bit 
is interpreted). A trap handler sets KNF if the 
trapped floating-point instruction should not be 
reexecuted. 

• SC (Shift Count) stores the shift count used by
the last right-shift instruction. It controls the num- 
ber of shifts executed by the double-shift instruc- 
tion. 

• PS (Pixel Size) and PM (Pixel Mask) are used by
the pixel-store and other graphics instructions. 
The values of PS control pixel size as defined by 
Table 2.2. The bits in PM correspond to pixels to 
be updated by the pixel-store instruction pst.d. 
The low-order bit of PM corresponds to the low- 
order pixel of the 64-bit source operand of pst.d. 
The number of low-order bits of PM that are actu- 
ally used is the number of pixels that fit into 
64-bits, which depends upon PS. If a bit of PM is 
set, then pst.d stores the corresponding pixel. 
Refer also to the pst.d instruction in section 10. 



Table 2.2. Values of PS

Value    Pixel Size in Bits    Pixel Size in Bytes

00       8                     1
01       16                    2
10       32                    4
11       (undefined)           (undefined)
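
The way PS and PM select which pixels a pixel store updates can be illustrated as below. This sketch models only the selection described above (one PM bit per pixel, low-order bit first); it does not model any other side effect of pst.d, and the function name is an assumption.

```python
PIXEL_BITS = {0b00: 8, 0b01: 16, 0b10: 32}   # PS encodings from Table 2.2


def masked_pixel_store(dest, src, ps, pm):
    """Return the 64-bit memory word after a selective pixel store.

    dest is the current memory content, src the 64-bit source operand.
    Bit i of pm selects pixel i, counting from the low-order pixel.
    """
    size = PIXEL_BITS[ps]
    pixels_per_word = 64 // size            # only this many PM bits are used
    result = dest
    for i in range(pixels_per_word):
        if (pm >> i) & 1:
            field = ((1 << size) - 1) << (i * size)
            result = (result & ~field) | (src & field)
    return result & 0xFFFFFFFFFFFFFFFF


# 16-bit pixels: store pixels 0 and 2 of the source, leave 1 and 3 untouched.
print(hex(masked_pixel_store(0, 0x1111222233334444, 0b01, 0b0101)))
```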



2.2.4 EXTENDED PROCESSOR STATUS REGISTER



The extended processor status register (epsr) con- 
tains additional state information for the current pro- 
cess beyond that stored in the psr. Figure 2.5 shows 
the format of the epsr. 

• The processor type is 2 for the i860 XP micro- 
processor. 

• The stepping number has a unique value that distinguishes
among different revisions of the processor.

• IL (Interlock) is set if a trap occurs after a lock 
instruction but before the last BRDY# of the load 
or store following the subsequent unlock 
instruction. IL indicates to the trap handler that a 
locked sequence has been interrupted. When the 
trap handler finds IL set, it should scan back- 
wards for the lock instruction and restart at that 
point. The absence of a lock instruction within 
30-33 instructions of the trap indicates a pro- 
gramming error. 
Figure 2.5. Extended Processor Status Register



• WP (write protect) controls the semantics of the
W bit of page table entries. A clear W bit in either 
the directory or the page table entry causes 
writes to be trapped. When WP is clear, writes 
are trapped in user mode, but not in supervisor 
mode. When WP is set, writes are trapped in both 
user and supervisor modes. 

• PEF (parity error flag) is set by the i860 XP microprocessor
when a parity error trap occurs. As
soon as PEF is set, further parity error and bus 
error traps are masked. Software must clear PEF 
to reenable such traps. PEF is set at RESET. 

• BEF (bus error flag) is set by the i860 XP microprocessor
when the BERR pin is asserted, indicating
a bus error. As soon as BEF is set, further
parity error and bus error traps are masked. Soft- 
ware must clear BEF to reenable such traps. BEF 
is set at RESET. 

• INT (Interrupt) is the value of the INT input pin.

• DCS (Data Cache Size) is a read-only field that
tells the size of the on-chip data cache. The number
of bytes actually available is 2^(12 + DCS);
therefore, a value of zero indicates 4 Kbytes, one 
indicates 8 Kbytes, etc. The value of DCS for the 
i860 XP microprocessor is two, which indicates 
16 Kbytes. 

• PBM (Page-Table Bit Mode) has no effect in
the i860 XP microprocessor. PBM is used by the 
i860 XR microprocessor. 

• BE (Big Endian) controls the ordering of bytes
within a data item in memory. Normally (i.e. when 
BE is clear) the i860 XP microprocessor operates 
in little endian mode, in which the addressed byte 
is the low-order byte. When BE is set (big endian 



mode), the low-order three bits of all 32-bit data 
load and store addresses are complemented, 
then masked to the appropriate boundary for 
alignment. This causes the addressed byte to be 
the most significant byte. Big endian mode affects
not only the memory load and store instructions
but also the ldio, stio, ldint, and scyc
instructions.

• OF (Overflow Flag) is set by adds, addu, subs,
and subu when integer overflow occurs. For 
adds and subs, OF is set if the carry from bit 31 
is different than the carry from bit 30. For addu, 
OF is set if there is a carry from bit 31. For subu, 
OF is set if there is no carry from bit 31 . Under all 
other conditions, it is cleared by these instruc- 
tions. OF may be changed by arithmetic instruc- 
tions in either user or supervisor mode. It may be 
changed by the st.c instruction in supervisor 
mode only. OF controls the function of the intovr 
instruction. Inside the trap handler, OF may not 
be valid for traps other than one caused by 
intovr. 

• BS (bus or parity error trap in supervisor mode) is 
set by the i860 XP microprocessor when a bus or 
parity error occurs during a supervisor mode 
memory access cycle. This is true even though 
the processor may have switched to user mode 
by the time these errors are reported. The BS bit 
contains valid information only if BERR is assert- 
ed in the same clock as BRDY# or one clock 
after that. In all other conditions the contents of 
the BS bit are undefined. The operating system 
can use this bit to decide, for example, whether 
to abort the process (user mode) or reboot the 
system (supervisor mode). 
• DI (trap on delayed instruction) is set by the
i860 XP microprocessor when a trap occurs on a
delayed instruction (the instruction located after a
delayed branch instruction). When DI is set, the
trap handler must restart the interrupted proce- 
dure from the branch instruction rather than at 
the address in fir. 

• TAI (trap on autoincrement instruction) is set by
the i860 XP microprocessor when a trap occurs 
on an instruction with autoincrement. When TAI is 
set, the trap handler should undo the autoincre- 
ment (that is, restore src2 to its original value). 

• PT (trap on pipeline use) indicates to the i860 XP
microprocessor that a trap should be generated 
and PI should be set when it executes an instruc- 
tion that uses the floating-point or graphics unit. 
Such instructions include all the instructions des- 
ignated "Floating-Point Unit" in Table 2.9, plus 
the pfld instruction. PT is set and cleared only by 
software. It can be used by the trap handler to 
avoid unnecessary saving and restoring of the 
pipelines (refer to section 2.8). When a trap due 
to PT occurs, the floating-point operation has not 
started, and the pipelines have not been ad- 
vanced. Such a trap also sets the IT bit of psr. 

• The behavior of PI (pipeline instruction) depends
on the setting of PT. If PT = 0, the i860 XP mi- 
croprocessor sets PI when any pipelined instruc- 
tion or pfld is executed. If PT = 1, the processor 
sets PI and traps when it decodes any instruction 
that uses the pipes, whether scalar or pipelined. 
PI may be set even if KNF is set and the next 
floating point instruction is suppressed. Refer to 
section 2.8. 

• SO (strong ordering) indicates whether the processor
is in strong ordering mode (SO = 1) or weak
ordering mode (SO = 0). SO is set if the EWBE#
pin is active (LOW) at RESET. (Refer to the para- 
graphs on write cycle reordering in section 5.) 
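
Several of the epsr fields above describe small rules that are easy to check in code. The following sketch covers DCS, WP, BE, and OF. All function names are illustrative assumptions. The subtract helpers assume the usual a + ~b + 1 formulation so that "carry from bit 31" has a concrete meaning, and the WP helper ignores every other protection check.

```python
MASK32 = 0xFFFFFFFF


def data_cache_bytes(dcs):
    """Data cache size in bytes: 2^(12 + DCS)."""
    return 1 << (12 + dcs)


def write_is_trapped(w_dir, w_page, user_mode, wp):
    """W clear in either entry protects the page; WP extends it to supervisor."""
    write_protected = not (w_dir and w_page)
    return write_protected and (user_mode or wp)


def big_endian_address(addr, size_bytes):
    """Complement the low three address bits, then align to the operand size."""
    return (addr ^ 0b111) & ~(size_bytes - 1)


def _add_with_carries(a, b, carry_in):
    """Return (32-bit sum, carry out of bit 30, carry out of bit 31)."""
    a &= MASK32
    b &= MASK32
    carry30 = ((a & 0x7FFFFFFF) + (b & 0x7FFFFFFF) + carry_in) >> 31
    total = a + b + carry_in
    return total & MASK32, carry30, total >> 32


def adds(a, b):
    result, c30, c31 = _add_with_carries(a, b, 0)
    return result, c31 != c30            # signed overflow


def addu(a, b):
    result, _, c31 = _add_with_carries(a, b, 0)
    return result, c31 == 1              # unsigned carry out


def subs(a, b):
    result, c30, c31 = _add_with_carries(a, ~b & MASK32, 1)
    return result, c31 != c30


def subu(a, b):
    result, _, c31 = _add_with_carries(a, ~b & MASK32, 1)
    return result, c31 == 0              # no carry means a borrow occurred
```

Each arithmetic helper returns the 32-bit result together with the value OF would take. For instance, `adds(0x7FFFFFFF, 1)` reports overflow because the carry into bit 31 differs from the carry out of it, and `big_endian_address(0, 1)` yields 7, the most significant byte of the addressed 64-bit word.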



2.2.5 DATA BREAKPOINT REGISTER 

The data breakpoint register (db) is used to gener- 
ate a trap when the i860 XP microprocessor access- 
es an operand at the virtual address stored in this 
register. The trap is enabled by BR and BW in psr. 
When comparing, a number of low order bits of the 
address are ignored, depending on the size of the 
operand. For example, a 16-bit access ignores the 
low-order bit of the address when comparing to db; 
a 32-bit access ignores the low-order two bits. This 
ensures that any access that overlaps the address 
contained in the register will generate a trap. The 
trap occurs before the register or memory update by 
the load or store instruction. 
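
The comparison against db can be sketched as follows. This is an illustration of the masking rule only; the function names and the boolean arguments standing in for the BR and BW bits are assumptions.

```python
MASK32 = 0xFFFFFFFF


def addresses_match(access_addr, size_bytes, db):
    """Compare while ignoring the low-order bits covered by the operand size."""
    keep = ~(size_bytes - 1) & MASK32
    return (access_addr & keep) == (db & keep)


def data_access_trap(access_addr, size_bytes, is_write, db, br, bw):
    """True when the access would trap: matching address and enabled direction."""
    enabled = bw if is_write else br
    return bool(enabled) and addresses_match(access_addr, size_bytes, db)


# A 32-bit read at 0x1000 overlaps a breakpoint set on byte 0x1003.
print(data_access_trap(0x1000, 4, False, db=0x1003, br=True, bw=False))
```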



2.2.6 DIRECTORY BASE REGISTER 

The directory base register dirbase (shown in Figure 
2.6) controls address translation, caching, and bus 
options. 

• ATE (Address Translation Enable), when set, en- 
ables the virtual-address translation algorithm. 

• DPS (DRAM Page Size) controls how many bits 
to ignore when comparing the current bus-cycle 
address with the previous bus-cycle address to 
generate the NENE# signal. This feature allows 
for higher speeds when using static column or 
page-mode DRAMs and consecutive reads and 
writes access the same column or page. The 
comparison ignores the low-order 12 + DPS bits. 
A value of zero is appropriate for one bank of 
256K x n RAMs, one for 1M x n RAMs, etc. For
interleaved memory, increase DPS by one for
each power of interleaving: add one for 2-way,
two for 4-way, etc. 
Figure 2.6. Directory Base Register

• When BL (Bus Lock) is set, external bus accesses
are locked. The LOCK# signal is asserted
with the next bus cycle (excluding instruction 
fetch and write-back cycles) whose internal bus 
request is generated after BL is set. It remains set 
on every subsequent bus cycle as long as BL re- 
mains set. The LOCK# signal is deasserted on 
the next load or store instruction after BL is 
cleared. Traps immediately clear BL. The lock 
and unlock instructions control the BL bit. The 
result of modifying BL with the st.c instruction is 
not defined. 

• ITI (Cache and TLB Invalidate), when set in the
value that is loaded into dirbase, causes all en- 
tries in the instruction cache and virtual tags in 
the address-translation cache (TLB) to be invalidated.
It also invalidates all virtual tags in the data
cache. The ITI bit does not remain set in dirbase. 
ITI always appears as zero when reading 
dirbase. 

• When software sets the LB bit, the i860 XP microprocessor
enters two-clock late back-off mode.
This mode gives two additional clock periods of 
decision time to the external logic that may need 
to use the BOFF# signal to cancel a bus cycle or 
data transfer. If the processor enters one-clock 
late back-off mode during RESET via configura- 
tion pin strapping, the LB bit has no effect, and it 
is impossible to enter two-clock late back-off 
mode. Furthermore, software cannot exit two- 
clock late back-off mode once it is activated; the 
LB bit cannot be cleared except by resetting the 
processor. 

• When CS8 (Code Size 8-Bit) is set, instruction
cache misses are processed as 8-bit bus cycles. 
When this bit is clear, instruction cache misses 
are processed as 64-bit bus cycles. This bit cannot
be set by software; hardware sets this bit at
initialization time. It can be cleared by software 
(one time only) to allow the system to execute out 
of 64-bit memory after bootstrapping from 8-bit 
EPROM. A nondelayed branch to code in 64-bit 
memory should directly follow the st.c (store con- 
trol register) instruction that clears CS8, in order 
to make the transition from 8-bit to 64-bit memory 
occur at the correct time. The branch instruction 
must be aligned on a 64-bit boundary. 

• RB (Replacement Block) identifies the cache line
(block) to be replaced by cache replacement al- 
gorithms. RB conditions the cache flush instruc- 
tion flush, which is discussed in Section 10. Ta- 
ble 2.3 explains the values of RB. 

• RC (Replacement Control) controls cache replacement
algorithms. Table 2.4 explains the significance
of the values of RC.



• DTB (Directory Table Base) contains the high-order
20 bits of the physical address of the page
directory when address translation is enabled (i.e. 
ATE = 1). The low-order 12 bits of the address 
are zeros. 
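
Two of the dirbase fields above reduce to simple address arithmetic, sketched here. The function names are assumptions, and the first function returns only whether the two addresses agree after the ignored bits are dropped; it makes no claim about the electrical level of NENE#.

```python
def same_dram_page(current_addr, previous_addr, dps):
    """Compare two bus-cycle addresses, ignoring the low-order 12 + DPS bits."""
    shift = 12 + dps
    return (current_addr >> shift) == (previous_addr >> shift)


def page_directory_address(dtb):
    """Physical address of the page directory: DTB is the high-order 20 bits."""
    return (dtb & 0xFFFFF) << 12


print(same_dram_page(0x1000, 0x1FFF, dps=0))
print(hex(page_directory_address(0x12345)))
```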





Table 2.3. Values of RB

Value    Replace TLB Block    Replace Instruction and Data Cache Block

00       0                    0
01       1                    1
10       2                    2
11       3                    3



Table 2.4. Values of RC

Value    Meaning

00       Selects the normal (random) replacement algorithm where any
         block in the set may be replaced on cache misses in all caches.

01       Instruction, data, and TLB cache misses replace the block
         selected by RB. This mode is used for cache and TLB testing.

10       Data cache misses replace the block selected by RB. Instruction
         and TLB caches use random replacement. This mode is used when
         flushing the data cache with the flush instruction.

11       Disables data and TLB caches replacement. Instruction cache
         uses random replacement.




2.2.7 FAULT INSTRUCTION REGISTER 

When a trap occurs, this register contains the ad- 
dress of the trapping instruction (not necessarily the 
instruction that created the conditions that required 
the trap). The fir is a read-only register. In single-instruction
mode, using a ld.c instruction to read the
fir anytime except the first time after a trap saves in
idest the address of the ld.c instruction; in dual-instruction
mode, the address of its floating-point companion
(address of the ld.c - 4) is saved.

2.2.8 FLOATING-POINT STATUS REGISTER 

The floating-point status register (fsr) contains the 
floating-point trap and rounding-mode status for the 
current process. Figure 2.7 shows its format. 
• If FZ (Flush Zero) is clear and underflow occurs,
a result-exception trap is generated. When FZ is 
set and underflow occurs, the result is set to zero, 
and no trap due to underflow occurs. 

• If TI (Trap Inexact) is clear, inexact results do not
cause a trap. If TI is set, inexact results cause a
trap. The sticky inexact flag (SI) is set whenever
an inexact result is produced, regardless of the
setting of TI.

• RM (Rounding Mode) specifies one of the four
rounding modes defined by the IEEE standard. 
Given a true result b that cannot be represented 
by the target data type, the i860 XP microproces- 
sor determines the two representable numbers a 
and c that most closely bracket b in value (a < 
b < c). The i860 XP microprocessor then rounds 
(changes) b to a or c according to the mode se- 
lected by RM as defined in Table 2.5. Rounding 
introduces an error in the result that is less than 
one least-significant bit. 



Table 2.5. Values of RM

Value  Rounding Mode             Rounding Action
-----  ------------------------  --------------------------------
00     Round to nearest or even  Closer to b of a or c; if equally
                                 close, select even number (the
                                 one whose least significant bit
                                 is zero).
01     Round down (toward -∞)    a
10     Round up (toward +∞)      c
11     Chop (toward zero)        Smaller in magnitude of a or c.
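
The selection rule of Table 2.5 can be illustrated with exact integer arithmetic. The following is a minimal sketch, not the processor's implementation: it rounds the exact quotient n/d onto the integers, so that a and c are the two neighboring integers that bracket b = n/d. The names round_quotient and enum rm are hypothetical, and the function assumes d > 0 and operands small enough that 2 * r does not overflow.

```c
#include <stdint.h>

/* RM encodings as listed in Table 2.5. */
enum rm { RM_NEAREST = 0, RM_DOWN = 1, RM_UP = 2, RM_CHOP = 3 };

/* Round the exact value b = n/d (d > 0) to an integer.
 * a and c are the integers with a < b < c when b is not an integer. */
int64_t round_quotient(int64_t n, int64_t d, enum rm mode)
{
    int64_t a = n / d;               /* C division truncates toward zero */
    int64_t r = n % d;
    if (r < 0) {                     /* convert to floor: 0 <= r < d */
        a -= 1;
        r += d;
    }
    if (r == 0)
        return a;                    /* b is representable, no rounding */

    int64_t c = a + 1;
    switch (mode) {
    case RM_NEAREST:
        if (2 * r < d) return a;     /* b is closer to a */
        if (2 * r > d) return c;     /* b is closer to c */
        return (a % 2 == 0) ? a : c; /* tie: pick the even neighbor */
    case RM_DOWN:
        return a;                    /* toward minus infinity */
    case RM_UP:
        return c;                    /* toward plus infinity */
    case RM_CHOP:
        return (a >= 0) ? a : c;     /* smaller magnitude of a or c */
    }
    return a;
}
```

For b = -5/2 the bracketing values are a = -3 and c = -2, so round down yields -3 while round up, chop, and round to nearest (tie broken toward the even value) all yield -2.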



[Figure: bit-field diagram of fsr labeling Trap Inexact,
Rounding Mode, Update, Floating-Point Trap Enable,
Sticky Inexact Flag, Source Exception, Multiplier
Underflow, Multiplier Overflow, Multiplier Inexact,
Multiplier Add One, Adder Underflow, and Adder
Overflow.]

Figure 2.7. Floating-Point Status Register
The U-bit (Update Bit), if set in the value that is
loaded into fsr by a st.c instruction, enables
updating of the result-status bits (AE, AA, AI, AO,
AU, MA, MI, MO, and MU) in the first stage of the
floating-point adder and multiplier pipelines. If this
bit is clear, the result-status bits are unaffected
by a st.c instruction; st.c ignores the corresponding
bits in the value that is being loaded. A st.c
always updates fsr bits 21..17 and 8..0 directly.
The U-bit does not remain set; it always appears
as zero when read.

The FTE (Floating-Point Trap Enable) bit, if clear, 
disables all floating-point traps (invalid input oper- 
and, overflow, underflow, and inexact result). 

SI (Sticky Inexact) is set when the last-stage
result of either the multiplier or adder is inexact (i.e.
when either AI or MI is set). SI is "sticky" in the
sense that it remains set until reset by software.
AI and MI, on the other hand, can be changed by
the subsequent floating-point instruction.

SE (Source Exception) is set when one of the 
source operands of a floating-point operation is 
invalid; it is cleared when all the input operands 
are valid. Invalid input operands include denor- 
mals, infinities, and all NaNs (both quiet and sig- 
naling). 

When read from the fsr, the result-status bits MA,
MI, MO, and MU (Multiplier Add-One, Inexact,
Overflow, and Underflow, respectively) describe
the last-stage result of the multiplier.

When read from the fsr, the result-status bits AA,
AI, AO, AU, and AE (Adder Add-One, Inexact,
Overflow, Underflow, and Exponent, respectively)
describe the last-stage result of the adder. The
high-order three bits of the 11-bit exponent of the
adder result are stored in the AE field.

The Adder Add-One and Multiplier Add-One bits 
indicate that the absolute value of the result frac- 
tion grew by one least-significant bit due to 
rounding. AA and MA are not influenced by the 
sign of the result. 

After a floating-point operation in a given unit (ad- 
der or multiplier), the result-status bits of that unit 
are undefined until the point at which result ex- 
ceptions are reported. 

When written to the fsr with the U-bit set, the 
result-status bits are placed into the first stage of 
the adder and multiplier pipelines. When the 
processor executes pipelined operations, it prop- 
agates the result-status bits of a particular unit 
(multiplier or adder) one stage for each pipelined 
floating-point operation for that unit. When they 
reach the last stage, they replace the normal re- 
sult-status bits in the fsr and generate traps, if 
enabled. When the U-bit is not set, result-status 
bits in the word being written to the fsr are ig- 
nored. 



In a floating-point dual-operation instruction (e.g.
add-and-multiply or subtract-and-multiply), both
the multiplier and the adder may set exception
bits. The result-status bits for a particular unit
remain set until the next operation that uses that
unit.

RR (Result Register) specifies which floating- 
point register (f0-f31) was the destination register 
when a result-exception trap occurs due to a sca- 
lar operation. 

IRP (Integer (Graphics) Pipe Result Precision), 
MRP (Multiplier Pipe Result Precision), and ARP 
(Adder Pipe Result Precision) aid in restoring 
pipeline state after a trap or process switch. Each 
defines the precision of the last-stage result in 
the corresponding pipeline. One of these bits is 
set when the result in the last stage of the corre- 
sponding pipeline is double precision; it is cleared 
if the result is single precision. 

LRP1 and LRP0 (Load Pipe Result Precision)
together define the size of the last-stage result of
the load pipeline. They are encoded as Table 2.6
shows.

Table 2.6. Values of LRP1 and LRP0

LRP1  LRP0  pfld Length
----  ----  -----------
0     0     (reserved)
0     1     4 Bytes
1     0     8 Bytes
1     1     16 Bytes
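
As an illustration of how a trap handler might use these two bits when saving the load pipeline, the sketch below decodes them into a byte count. It assumes the rows of Table 2.6 run from LRP1:LRP0 = 00 (reserved) through 11 (16 bytes); the function name pfld_length_bytes is hypothetical, and the caller is assumed to have already extracted the two bits from fsr.

```c
/* Returns the pfld result length in bytes for the LRP1:LRP0 encoding,
 * or 0 for the reserved encoding 00. */
unsigned pfld_length_bytes(unsigned lrp1, unsigned lrp0)
{
    static const unsigned length[4] = { 0, 4, 8, 16 };
    return length[((lrp1 & 1u) << 1) | (lrp0 & 1u)];
}
```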



2.2.9 KR, KI, T, AND MERGE REGISTERS

The KR, KI, and T registers are special-purpose
registers used by the dual-operation floating-point
instructions pfam, pfsm, pfmam, and pfmsm, which
initiate both an adder operation and a multiplier
operation. The KR, KI, and T registers can store values
from one dual-operation instruction and supply them
as inputs to subsequent dual-operation instructions.
(Refer to Figure 2.16.)

The MERGE register is used only by the graphics 
instructions. The purpose of the MERGE register is 
to accumulate (or merge) the results of multiple-ad- 
dition operations that use as operands the color-in- 
tensity values from pixels or distance values from a 
Z-buffer. The accumulated results can then be 
stored in one 64-bit operation. 

Two multiple-addition instructions and an OR in- 
struction use the MERGE register. The addition in- 
structions are designed to add interpolation values 
to each color-intensity field in an array of pixels or to 
each distance value in a Z-buffer. 
Refer to the instruction descriptions in section 10 for
more information about these registers.

2.2.10 BUS ERROR ADDRESS REGISTER 

The bear helps the trap handler determine faulty 
memory locations. The i860 XP microprocessor 
loads a valid address into bear under these condi- 
tions: 

• For bus errors, the bear receives the address of 
the cycle for which the BERR signal is asserted, if 
external hardware asserts BERR in the same 
clock as it asserts BRDY# or one clock later. 

• For parity errors on a read, the bear receives the 
address of the cycle during which the processor 
detects the error, if external hardware asserts 
PEN# with BRDY# for that cycle. 

If external hardware does not meet these conditions, 
the contents of the bear are undefined. 

A valid address in bear is accurate to 29 bits; that is, 
address signals A31-A3 are latched in the high-or- 
der 29 bits of bear. At RESET and after every parity 
and bus error trap, software must read the bear be- 
fore further parity and bus error traps can occur. The 
bear is a read-only register. 
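
Because only A31-A3 are latched, a trap handler can treat a valid bear value as identifying an aligned 8-byte block rather than a single byte. The sketch below shows that masking; the function name is hypothetical, and treating the low three bits as carrying no address information is an assumption drawn from the 29-bit accuracy stated above.

```c
#include <stdint.h>

/* Keep address bits A31-A3; the result is the base address of the
 * 8-byte block in which the bus or parity error was reported. */
uint32_t bear_block_address(uint32_t bear)
{
    return bear & 0xFFFFFFF8u;
}
```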

2.2.11 PRIVILEGED REGISTERS 

The registers p0, p1, p2, and p3 are provided for the
operating system to use. They do not affect processor
operation. They can be accessed by the ld.c and
st.c instructions, but they can be written only in
supervisor mode. They may be used to store information
such as the interrupt stack pointer, current user
stack pointer at the beginning of the trap handler,
register values during trap handling, processor ID in
a multiprocessor system, or for any other purpose.



2.2.12 CONCURRENCY CONTROL REGISTER 

The concurrency control register (ccr) controls the 
operation of the internal Concurrency Control Unit 
(CCU), which is described in section 2.5. The ccr 
can be written in supervisor mode only, but can be 
read in user or supervisor mode. Figure 2.8 shows 
the format of the ccr. 

The DO (Detached Only) bit and the CO (CCU On) bit
together specify the CCU configuration. DO, when set,
indicates that there is no external CCU. CO, when set,
indicates that the Concurrency Control Architecture is
enabled. Table 2.7 summarizes the modes defined by the
CO and DO bits. The reserved combinations should not
be used by software.

If the DCCU is on (CO = DO = 1), the processor
intercepts and interprets all memory loads and stores
that target the CCU address space, which is the
two pages defined by CCUBASE. Loads and stores
to that address range do not go to memory, but to
the DCCU.





Table 2.7. Values of CO and DO

CO  DO  Mode
--  --  ------------------------
0   0   External CCU, or no CCU
0   1   reserved
1   0   reserved
1   1   Internal CCU (DCCU) only



CCUBASE is the virtual address of the memory area 
into which the CCU registers are mapped. Software 
must set bit 12 to zero, because the CCUBASE must 
be aligned on a two page (8 Kbyte) boundary. This is 
because an external CCU contains supervisor regis- 
ters mapped to the second page. 
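
The alignment rule and the two-page layout can be checked with simple mask arithmetic. The sketch below is illustrative only: the names ccubase_is_valid, ccu_classify, and enum ccu_page are hypothetical, and it treats CCUBASE as a full 32-bit virtual address rather than as a field extracted from ccr.

```c
#include <stdbool.h>
#include <stdint.h>

#define CCU_PAGE_SIZE 0x1000u               /* one 4-Kbyte page */
#define CCU_AREA_SIZE (2u * CCU_PAGE_SIZE)  /* two pages, 8 Kbytes */

/* CCUBASE must sit on an 8-Kbyte boundary, so bits 12..0 are zero. */
bool ccubase_is_valid(uint32_t base)
{
    return (base & (CCU_AREA_SIZE - 1u)) == 0;
}

enum ccu_page { CCU_NONE, CCU_FIRST_PAGE, CCU_SECOND_PAGE };

/* Classify a virtual address relative to the CCU area. The second
 * page is the one that holds an external CCU's supervisor registers. */
enum ccu_page ccu_classify(uint32_t base, uint32_t vaddr)
{
    uint32_t off = vaddr - base;            /* wraps modulo 2^32 */
    if (off >= CCU_AREA_SIZE)
        return CCU_NONE;
    return (off < CCU_PAGE_SIZE) ? CCU_FIRST_PAGE : CCU_SECOND_PAGE;
}
```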
Figure 2.8. Concurrency Control Register

2.2.13 NEWCURR REGISTER

The NEWCURR register is part of the detached CCU
(concurrency control unit). It is a 32-bit counter that
supplies an iteration count for loop execution. (Refer
to section 2.5.)

NEWCURR is architecturally a 64-bit register, but 
only the low-order 32 bits are provided in this imple- 
mentation. Compiler and operating-system data 
structures should provide for a 64-bit size for future 
implementation. 



2.2.14 STAT REGISTER 

The STAT register is part of the detached CCU (con- 
currency control unit). As Figure 2.9 shows, it con- 
tains the following bits: 

InLoop    Indicates that the processor is currently
          executing a concurrent loop. This bit is
          set when a processor starts a concurrent,
          non-nested loop, and it is cleared
          when the processor enters serial code
          when not nested or idle. It can also be
          read or written directly.

Nested    Indicates whether the processor is in the
          nested state. InLoop is copied into this
          bit when starting a nested loop. Otherwise,
          it can be read or written directly.

Detached  Always contains the value of ccr bit DO.

STAT is architecturally a 64-bit register. Compiler 
and operating-system data structures should provide 
for a 64-bit size for future implementation. 



2.3 Addressing 

Memory is addressed in byte units with a paged
virtual-address space of 2^32 bytes. Data and
instructions can be located anywhere in this address
space. Address arithmetic uses 32-bit input values
and produces 32-bit results. The low-order 32 bits of
the result are used in case of overflow.



Normally, multibyte data values are stored in memo- 
ry in little endian format, i.e. with the least significant 
byte at the lowest memory address. As an option, 
the ordering can be dynamically selected by soft- 
ware in supervisor mode. The i860 XP microproces- 
sor also offers big endian mode, in which the most 
significant byte of a data item is at the lowest ad- 
dress. Figure 2.10 defines by example how data is 
transferred from memory over the bus into a register 
in both modes. Big endian and little endian data ar- 
eas should not be mixed within a 64-bit data word. 
Illustrations of data structures in this data sheet 
show data stored in little endian mode, i.e. the right- 
most (low-order) byte is at the lowest memory ad- 
dress. 

Code accesses are always done with little endian 
addressing. This implies that instructions appear dif- 
ferently than documented here when accessed as 
big endian data. Intel Corporation recommends that 
disassemblers running in a big endian system con- 
vert instructions that have been read as data back to 
little endian form and present them in the format 
documented here. 

Page directories and page tables are also accessed 
in little endian mode, regardless of the value of the 
BE bit. 

Big endian mode affects not only the memory load
and store instructions but also the ldio, stio, ldint,
and scyc instructions.

Alignment requirements are as follows (any violation 
results in a data-access trap): 

• 128-bit values are aligned on 16-byte boundaries
when referenced in memory (i.e. the four least
significant address bits must be zero).

• 64-bit values are aligned on 8-byte boundaries
when referenced in memory (i.e. the three least
significant address bits must be zero).

• 32-bit values are aligned on 4-byte boundaries
when referenced in memory (i.e. the two least
significant address bits must be zero).
Figure 2.9. Concurrency Status Register

[Figure: for the loads ld.b 0(r0),r16 through
ld.b 7(r0),r16, ld.s 0(r0),r16 through ld.s 6(r0),r16,
ld.l 0(r0),r16, and ld.l 4(r0),r16, the diagram shows
the byte enables asserted (BEn#) and the data bytes
transferred in little endian and in big endian mode.
The ld.s loads at offsets 0, 2, 4, and 6 assert byte
enables 1:0, 3:2, 5:4, and 7:6 in little endian mode
and 7:6, 5:4, 3:2, and 1:0 in big endian mode. The
ld.l loads at offsets 0 and 4 assert byte enables 7:4
and 3:0 in big endian mode.]

NOTE:
64- and 128-bit big endian accesses are treated the
same as little endian accesses.

Figure 2.10. Little and Big Endian Memory Transfers



• 16-bit values are aligned on 2-byte boundaries
when referenced in memory (i.e. the least
significant address bit must be zero).
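
Each of the alignment rules above reduces to testing the low-order address bits against the operand size. A minimal sketch follows; the function name is hypothetical, and size_bytes is assumed to be one of the power-of-two operand sizes 2, 4, 8, or 16 (byte accesses are always aligned).

```c
#include <stdbool.h>
#include <stdint.h>

/* True when addr satisfies the alignment rule for an operand of
 * size_bytes bytes; a false result corresponds to the case in which
 * the processor would raise a data-access trap. */
bool is_naturally_aligned(uint32_t addr, uint32_t size_bytes)
{
    return (addr & (size_bytes - 1u)) == 0;
}
```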



2.4 Virtual Addressing

When address translation is enabled, the processor
maps instruction and data virtual addresses into
physical addresses before referencing memory. This
address transformation is compatible with that of the
Intel386 and Intel486 microprocessors and implements
the basic features needed for page-oriented
virtual-memory systems and page-level protection.

Address translation is optional. It is disabled when
the processor is reset. It is enabled when a store
(st.c) to dirbase sets the ATE bit. The operating
system typically does this during software
initialization. Address translation is disabled again
when st.c clears the ATE bit. The ATE bit must be set
if the operating system is to implement page-oriented
protection or page-oriented virtual memory.



2.4.1 PAGE FRAME 

A page frame is a unit of contiguous addresses of 
physical main memory. A page is the collection of 
data that occupies a page frame when that data is 
present in main memory or occupies some location 
in secondary storage when there is not sufficient 
space in main memory. 

The i860 XP microprocessor architecture supports 
two sizes of pages and page frames: four Mbytes 
and four Kbytes. Four Kbyte page frames begin on 
four Kbyte boundaries and are fixed in size. Four 
Mbyte page frames begin on four Mbyte boundaries 
and are fixed in size. The four Kbyte address
transformation is compatible with that of the Intel486
microprocessor.
2.4.2 VIRTUAL ADDRESS

A virtual address refers indirectly to a physical
address by specifying a page and an offset within that
page. Figure 2.11 shows the formats of virtual
addresses. The format for virtual addresses that refer
to four Mbyte pages is different from that of four
Kbyte pages.

Figure 2.12 shows how the i860 XP microprocessor
converts a virtual address into the physical address
by consulting page tables. The addressing mechanism
uses the DIR field as an index into a page
directory. For 4K pages, it uses the PAGE field as an
index into the page table determined by the page
directory and uses the OFFSET field to address a
byte within the page determined by the page table.
For 4M pages, the page directory entry determines
the page address, and the OFFSET field addresses
a byte within that page.
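
The field extraction can be sketched in C as follows. The field widths used here are an assumption based on the Intel386-compatible layout and on the widths stated in section 2.4.5 (a 12-bit OFFSET for 4-Kbyte pages and a 22-bit OFFSET for 4-Mbyte pages, with 10-bit DIR and PAGE fields); Figure 2.11 remains the authoritative definition. The struct and function names are hypothetical.

```c
#include <stdint.h>

struct va4k { uint32_t dir, page, offset; };  /* 10 + 10 + 12 bits */
struct va4m { uint32_t dir, offset; };        /* 10 + 22 bits */

/* Split a virtual address that refers to a 4-Kbyte page. */
struct va4k split_va_4k(uint32_t va)
{
    struct va4k f;
    f.dir    = va >> 22;              /* index into the page directory */
    f.page   = (va >> 12) & 0x3FFu;   /* index into the page table */
    f.offset = va & 0xFFFu;           /* byte within the 4-Kbyte page */
    return f;
}

/* Split a virtual address that refers to a 4-Mbyte page. */
struct va4m split_va_4m(uint32_t va)
{
    struct va4m f;
    f.dir    = va >> 22;              /* index into the page directory */
    f.offset = va & 0x3FFFFFu;        /* byte within the 4-Mbyte page */
    return f;
}
```

For the address 0x12345678, this yields DIR = 0x48, PAGE = 0x345, and OFFSET = 0x678 under the 4-Kbyte format, and DIR = 0x48 with OFFSET = 0x345678 under the 4-Mbyte format.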



2.4.3 PAGE TABLES 

A page table is simply an array of 32-bit page
specifiers. A page table is itself a page, and contains
4 Kbytes of data or at most 1K 32-bit entries.

At the highest level is a page directory. The page
directory holds up to 1K entries that address either
page tables of the second level or 4-Mbyte pages.

A page table of the second level addresses up to 1K
4-Kbyte pages. All the tables addressed by one
page directory, therefore, can address 1M 4-Kbyte
pages.

Whether 4-Mbyte pages, 4-Kbyte pages, or some
combination of the two are used, one page directory
can cover the entire four gigabyte physical address
space of the i860 XP microprocessor (1K page
directory entries x 4M page or 1K page directory
entries x 1K page table entries x 4K page).

The physical address of the current page directory is
stored in the DTB field of the dirbase register.
Memory management software has the option of using
one page directory for all processes, one page
directory for each process, or some combination of the
two.



2.4.4 PAGE-TABLE ENTRIES 

Page-table entries (PTEs) have one of the formats 
shown by Figure 2.13. 



2.4.4.1 Page Frame Address 

The page frame address specifies the physical start- 
ing address of a page. In a page directory, the page 
frame address is either the address of a page table 
or the address of the four Mbyte page frame that 
contains the desired memory operand. In a second- 
level page table, the page frame address is the ad- 
dress of the 4-Kbyte page frame that contains the 
desired memory operand. 
2.4.4.2 Present Bit

The P (present) bit indicates whether a page table
entry can be used in address translation. P = 1
indicates that the entry can be used. When P = 0 in
either level of page tables, the entry is not valid for
address translation, and the rest of the entry is
available for software use; none of the other bits in
the entry is tested by the hardware. If P = 0 in either
level of page tables when an attempt is made to use a
page-table entry for address translation, the
processor signals either a data-access fault or an
instruction-access fault. In software systems that
support paged virtual memory, the trap handler can
bring the required page into physical memory.

Note that there is no P bit for the page directory 
itself. The page directory may be not-present while 
the associated process is suspended, but the oper- 
ating system must ensure that the page directory 
indicated by the dirbase image associated with the 
process is present in physical memory before the 
process is dispatched. 

2.4.4.3 Writable and User Bits 

The W (writable) and U (user) bits are used for page- 
level protection, which the i860 XP microprocessor 
performs at the same time as address translation. 
The concept of privilege for pages is implemented 
by assigning each page to one of two levels: 

Supervisor level (U = 0)  For the operating system and
                          other systems software and
                          related data.

User level (U = 1)        For applications procedures
                          and data.



The U bit of the psr indicates whether the i860 XP 
microprocessor is executing at user or supervisor 
level. The i860 XP microprocessor maintains the 
U bit of psr as follows: 

• The i860 XP microprocessor clears the psr U bit 
to indicate supervisor level when a trap occurs 
(including when the trap instruction causes the 
trap). The prior value of U is copied into PU. 

• The i860 XP microprocessor copies the psr 
PU bit into the U bit when an indirect branch is 
executed and one of the trap bits is set. If PU was 
one, the i860 XP microprocessor enters user lev- 
el. 

With the U bit of psr and the W and U bits of the 
page table entries, the i860 XP microprocessor im- 
plements the following protection rules: 

• When at user level, a read or write of a supervi- 
sor-level page causes a trap. 



• When at user level, a write to a page whose W bit 
is not set causes a trap. 

• When at user level, a store (st.c) to certain con- 
trol registers is ignored. 

• When at user level, privileged instructions (ldio,
stio, scyc, ldint) have no effect.

When the i860 XP microprocessor is executing at 
supervisor level, all pages are addressable, but, 
when it is executing at user level, only pages that 
belong to the user level are addressable. 

When the i860 XP microprocessor is executing at 
supervisor level, all pages are readable. Whether a 
page is writable depends upon the write-protection 
mode controlled by WP of epsr: 

WP = 0  All pages are writable.

WP = 1  A write to a page whose W bit is not set
        causes a trap.

When the i860 XP microprocessor is executing at 
user level, only pages that belong to user level and 
are marked writable are actually writable; pages that 
belong to supervisor level are neither readable nor 
writable from user level. 
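
The page-level rules above can be condensed into a single predicate. This is a minimal sketch of one reading of those rules, not a description of the hardware logic; the struct page_attr and the function access_permitted are hypothetical names, and the attributes passed in are assumed to be the effective U and W bits for the page.

```c
#include <stdbool.h>

/* Effective protection bits of a page. */
struct page_attr {
    bool user;      /* U = 1: page belongs to user level */
    bool writable;  /* W = 1: page is marked writable */
};

/* psr_u:   true when the processor executes at user level
 * epsr_wp: WP bit of epsr (write protection in supervisor mode)
 * Returns false where the rules above call for a trap. */
bool access_permitted(bool psr_u, bool epsr_wp,
                      struct page_attr pg, bool is_write)
{
    if (psr_u) {
        if (!pg.user)
            return false;   /* supervisor page touched from user level */
        if (is_write && !pg.writable)
            return false;   /* write to a page whose W bit is clear */
        return true;
    }
    /* Supervisor level: every page is readable; a write to a page
     * whose W bit is clear traps only when WP = 1. */
    if (is_write && epsr_wp && !pg.writable)
        return false;
    return true;
}
```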



2.4.4.4 Write-Through Bit 

The i860 XP microprocessor implements both
write-back and write-through caching policies for the
on-chip instruction and data caches. If WT is set, the
write-through policy is applied to data from the
corresponding page. If WT is clear, the normal
write-back policy is applied to data from the page.

For four-Mbyte pages, the WT bit of the page direc- 
tory entry is used. For four-Kbyte pages, only the WT 
bit of the second-level page table entry is used; the 
WT bit of the page directory entry is not referenced 
by the processor, but is reserved. 

The value of the WT bit is driven externally on the 
PWT pin, so that external caches can employ the 
same policy used internally. 



2.4.4.5 Cache Disable Bit 

If a page's CD (cache disable) bit is set, data from 
the page is not placed in the internal instruction or 
data caches (regardless of the value of the WT bit). 
Clearing CD permits the processor to place data 
from the associated page into internal caches. 

For four-Mbyte pages, the CD bit of the page direc- 
tory entry is used. For four-Kbyte pages, only the CD 
bit of the second-level page table entry is used; the 
CD bit of the page directory entry is not referenced 
by the processor, but is reserved. 
The value of the CD bit is driven externally on the
PCD pin, so that cacheability can be the same in
both internal and external caches.



2.4.4.6 Accessed and Dirty Bits 

The A (accessed) and D (dirty) bits provide data 
about page usage in both levels of the page tables. 

The i860 XP microprocessor sets the A-bit before a 
read or write operation to a page. For four-Kbyte 
pages, it sets the A-bit of both levels of page tables. 

The processor tests the dirty bit before a write, and, 
under certain conditions, causes traps. The trap 
handler then has the opportunity to maintain appro- 
priate values in the dirty bits. For four-Mbyte pages, 
the D bit of the page directory entry is used. For four- 
Kbyte pages, only the D bit of the second-level page 
table entry is used; the D bit of the page directory 
entry is not referenced by the processor, but is 
reserved. The precise algorithm for using these bits 
is specified in section 2.4.5. 

An operating system that supports paged virtual 
memory can use the D and A bits to determine what 
pages to eliminate from physical memory when the 
demand for memory exceeds the physical memory 
available. The D and A bits are normally initialized to 
zero by the operating system. The processor sets 
the A bit when a page is accessed either by a read 
or write operation. When a data-access fault occurs, 
the trap handler sets the D bit if an allowable write is 
being performed, then reexecutes the instruction. 

The operating system is responsible for coordinating 
its updates to the accessed and dirty bits with up- 
dates by the CPU and by other processors that may 
share the page tables. The i860 XP microprocessor 
automatically asserts the LOCK# signal while test- 
ing and setting the A bit. 



2.4.4.7 Page Tables for Trap Handlers 

When paging is enabled (ATE = 1), software that
creates page tables and directories must ensure that
A = 1 always in the PTEs and PDEs for the code
pages of the trap handler and the first data page
accessed by the handler. Preallocation of these
pages is required in case a trap occurs during a lock
sequence. Otherwise, recursive traps would be
generated, as the A-bit would need to be set by the
translation hardware, which is a trapping situation in
itself.



2.4.4.8 Combining Protection of Both Levels of 
Page Tables 

For any four-Kbyte page, the protection attributes of 
its page directory entry may differ from those of its 
page table entry. The i860 XP microprocessor com- 
putes the effective protection attributes for a page 
by examining the protection attributes in both the 
directory and the page table and choosing the more 
restrictive of the two. 
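
Since supervisor-only (U = 0) is more restrictive than user (U = 1), and read-only (W = 0) is more restrictive than writable (W = 1), choosing the more restrictive attribute amounts to a logical AND of the corresponding bits. The sketch below shows that reading; the struct and function names are hypothetical.

```c
#include <stdbool.h>

/* U and W protection bits taken from one page-table level. */
struct prot_bits {
    bool user;      /* U bit */
    bool writable;  /* W bit */
};

/* Effective protection of a 4-Kbyte page: a bit is set only if it is
 * set in both the page directory entry and the page table entry. */
struct prot_bits effective_protection(struct prot_bits pde,
                                      struct prot_bits pte)
{
    struct prot_bits eff;
    eff.user     = pde.user && pte.user;
    eff.writable = pde.writable && pte.writable;
    return eff;
}
```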



2.4.5 ADDRESS TRANSLATION ALGORITHM 

The following algorithm defines the translation of 
each virtual address to a physical address. Let DIR, 
PAGE, and OFFSET be the fields of the virtual ad- 
dress; let PFA1 and PFA2 be the page frame ad- 
dress fields of the first and second level page tables 
respectively; DTB is the page directory table base 
address stored in the dirbase register. 

1. Read the PDE (Page Directory Entry) at the 
physical address formed by DTB:DIR:00. 

2. If P in the PDE is zero, generate a data- or in- 
struction-access fault. 

3. If W in the PDE is zero, the operation is write,
and either the U bit of the psr is set or WP = 1,
generate a data-access fault.

4. If the U bit in the PDE is zero and U bit in the psr 
is set, generate a data- or instruction-access 
fault. 

5. If A in the PDE is zero and the TLB miss oc- 
curred inside a locked sequence, generate a 
data or instruction access fault. (The trap allows 
software to set A to one and restart the se- 
quence. This helps external bus hardware deter- 
mine unambiguously what address corresponds 
to a locked semaphore.) 

6. If bit 7 of the PDE is one (four Mbyte page), and
the operation is write, and D = 0 in the PDE,
generate a data-access fault.

7. If A = 1 in the PDE, continue at step 11.
Otherwise, assert LOCK#.

8. Perform the PDE read as in step 1 and the P, W 
and U bit checks as in steps 2 through 4. 

9. Write the PDE with A bit set. 

10. Deassert LOCK#. 

11. If bit 7 of the PDE is one (four Mbyte page), form 
the physical address as PFA1 :OFFSET, and exit 
address translation. In this case, PFA1 is 10 bits 
and OFFSET is 22 bits. 

12. The remaining steps are for four Kbyte pages. If
the A-bit in the PDE was zero before translation
began, assert LOCK#.

13. Fetch the PTE at the physical address formed
by PFA1:PAGE:00.

14. Perform the P-, W-, U-, and A-bit checks as in
steps 2 through 5 with the second-level PTE. If
A = 0 in the PTE, and the TLB miss occurred
inside a locked sequence, generate a
data or instruction access fault. LOCK#
remains active.

15. If the operation is write, and D in the PTE is 
zero, generate a data access fault. 

16. If the A-bit in the PDE was already active before 
translation began, and the A-bit in the PTE is 
already active, go to step 20. 

17. If LOCK# is not already active, assert it and 
refetch the PTE. 

18. Perform the U-, W-, and P-bit checks and A-bit 
setting in the PTE as in steps 8 through 9. Do 
the locked write update of the PTE to unlock the 
bus, even if the A-bit in the PTE is already one. 

19. Deassert LOCK#. 

20. Form the physical address as PFA2:OFFSET. In 
this case, PFA2 is 20 bits and OFFSET is 12 
bits. 
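
The steps above can be sketched as a software model of the table walk. This is a simplified illustration under stated assumptions, not the hardware algorithm: the P, W, U, A, and D bit positions are assumed to follow the Intel386-compatible layout (only bit 7 as the four-Mbyte indicator is stated in the steps above); LOCK# sequencing and the locked-sequence fault of step 5 are not modeled; and physical memory is simulated by a word array. The names struct mmu, entry_ok, and translate are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions (Intel386-compatible layout). */
#define PTE_P  0x001u   /* present */
#define PTE_W  0x002u   /* writable */
#define PTE_U  0x004u   /* user */
#define PTE_A  0x020u   /* accessed */
#define PTE_D  0x040u   /* dirty */
#define PDE_4M 0x080u   /* bit 7: entry maps a four-Mbyte page */

struct mmu {
    uint32_t *mem;   /* simulated physical memory, indexed by word */
    uint32_t  dtb;   /* physical address of the page directory */
    bool      psr_u; /* executing at user level */
    bool      wp;    /* WP bit of epsr */
};

/* P, W, and U checks of steps 2 through 4; when 'final' is set the
 * entry maps the page itself, so the D-bit check is applied as well. */
static bool entry_ok(const struct mmu *m, uint32_t e, bool write, bool final)
{
    if (!(e & PTE_P)) return false;
    if (!(e & PTE_W) && write && (m->psr_u || m->wp)) return false;
    if (!(e & PTE_U) && m->psr_u) return false;
    if (final && write && !(e & PTE_D)) return false;
    return true;
}

/* Returns true and stores the physical address in *pa on success;
 * returns false where the algorithm would generate a fault. */
bool translate(struct mmu *m, uint32_t va, bool write, uint32_t *pa)
{
    uint32_t pde_addr = (m->dtb & 0xFFFFF000u) | ((va >> 22) << 2);
    uint32_t pde = m->mem[pde_addr / 4];               /* DTB:DIR:00 */
    bool big = (pde & PDE_4M) != 0;

    if (!entry_ok(m, pde, write, big)) return false;
    m->mem[pde_addr / 4] = pde | PTE_A;                /* set A in the PDE */

    if (big) {                                         /* PFA1:OFFSET */
        *pa = (pde & 0xFFC00000u) | (va & 0x003FFFFFu);
        return true;
    }

    uint32_t pte_addr = (pde & 0xFFFFF000u) | (((va >> 12) & 0x3FFu) << 2);
    uint32_t pte = m->mem[pte_addr / 4];               /* PFA1:PAGE:00 */

    if (!entry_ok(m, pte, write, true)) return false;
    m->mem[pte_addr / 4] = pte | PTE_A;                /* set A in the PTE */

    *pa = (pte & 0xFFFFF000u) | (va & 0x00000FFFu);    /* PFA2:OFFSET */
    return true;
}
```

In this model a write to a page whose D bit is clear returns false, which corresponds to the data-access fault that gives the trap handler the opportunity to set D and reexecute the instruction.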



During translation, the i860 XP microprocessor looks
only in external memory for page directories and
page tables. The data cache is not searched.
Therefore, any code that modifies page directories or
page tables must keep them out of the cache. The
tables should be kept in noncacheable memory or in
write-through pages, or should be flushed from the
cache.

The i860 XP microprocessor expects page directories
and page tables to be in little endian format. The
operating system must maintain these tables in little
endian format either by setting BE to zero when
manipulating the tables or by complementing bit two of
the 32-bit address when loading or storing entries.

2.4.6 ADDRESS TRANSLATION FAULTS

The address translation fault can be signalled as
either an instruction-access fault or a data-access
fault. The instruction causing the fault can be
reexecuted upon returning from the trap handler.



2.5 Detached CCU

The i860 XP microprocessor supports parallel
processing, where multiple processors work
simultaneously on different parts of the same problem.
The Concurrency Control Unit (CCU) controls work
sharing among CPUs in multiprocessor systems. The
CCU is a VLSI chip that allows multiple processors
to work together to execute portions of a single
program in parallel. The CCU performs the iteration
assignment for loop parallelization. Accesses to the
CCU for synchronization are much faster than
accesses to shared memory semaphores. The CCU is
memory mapped, and its internal registers are
accessed via memory load and store operations.

To take advantage of the parallel architecture,
software must be compiled by parallelizing compilers
that generate instructions to access the CCU.
However, such instructions cannot run on a system that
does not include a CCU. To allow an application
compiled for parallel execution to run on any system
based on the i860 XP microprocessor, a "Detached
Only" CCU (DCCU, also referred to as "internal
CCU") is implemented in the i860 XP microprocessor.
The DCCU is a compatible subset of the external
CCU, consisting of the minimal set of features
required for a single CPU. The DCCU alone increases
neither performance nor concurrency, but does
allow software designed for parallel processing to
run unmodified on a single CPU.



2.5.1 DCCU INITIALIZATION 

After reset, the i860 XP microprocessor DCCU is
disabled (CO and DO bits in ccr are cleared). To
enable the DCCU, the CO and DO bits in ccr must be
set by software. Before turning on the CCU, the
operating system must invalidate the TLB and flush the
data cache to make sure that they do not contain
data from the CCU pages. The TLB is invalidated by
setting ITI = 1 in the dirbase register. Also, the
flush instruction must be used once for each line of
the data cache to invalidate the physical address of
the cache entry, if the two pages at the CCUBASE
address may have been cached. The flush is
unneeded if page tables or external hardware have
prohibited caching of the CCUBASE pages.

Neither the external CCU nor the DCCU can be ac- 
cessed within four instructions after ccr is modified. 



2.5.2 DCCU ADDRESSING 

The CCU facilities are memory-mapped, manipulat- 
ed by normal load and store instructions. The DCCU 
is memory-mapped to a single 4 Kbyte user page. 
When the DCCU is active, all accesses to this page 
are satisfied by the DCCU, and no external bus cycle 
is generated. The address space of two adjacent 
pages beginning on an 8 Kbyte boundary is reserved 
for the CCU. The first (lower address) page contains
locations accessible in user mode (which includes
the DCCU registers), and the second page contains 
locations accessible in supervisor mode (used for 
external CCU only). The base address of these 
pages is specified by the CCUBASE field in ccr. Ac- 
cesses to the second page in DCCU-only mode 
have no effect on the DCCU, and are treated as 
normal memory accesses. 

When the DCCU is active, accesses to its address 
page use only the virtual address, and no translation 
is done on the DCCU access. However, the access- 
es to an external CCU go through normal address 
translation. The operating system should make sure 
that the page table entries for the CCU pages are 
set so that no fault occurs during address transla- 
tion. If an external CCU is used, the two PTEs for the 
CCU should have CD = 1 (caching disabled) and 
page frame addresses that match the external hard- 
ware addresses of the CCU. Accesses to the DCCU 
that cause a TLB miss do not cause the PTE to be 
loaded into the TLB. 

If the external CCU is used when address translation 
is disabled (ATE = 0), external hardware must deac- 
tivate KEN# for such accesses, to avoid caching 
external CCU accesses. 



2.5.3 DCCU INTERNALS 

The DCCU consists of an address decoder, a 32-bit 
counter (NEWCURR), and three bits of state infor- 
mation (InLoop, Nested, and Detached). InLoop, 
Nested and Detached correspond to bits 0, 1 , and 2 
respectively of the external CCU STAT register. The 
Detached bit always reflects the value of the DO bit 
in ccr. 

Several addresses within the DCCU memory page are decoded
to cause actions on NEWCURR and on the InLoop and Nested
state bits. The CCU register to be accessed is specified by
address bits 11-3. The valid CCU addresses are shown in
Table 2.8 with their mnemonics. Accesses to these addresses
may also have side effects within the DCCU. Refer to the
i860 Microprocessor Family Programmer's Reference Manual
for programming information. Loads from any other addresses
within the DCCU memory page return zero; stores to any
other addresses have no effect. Access to the DCCU by any
load or store instruction other than ld.x and st.x produces
undefined results.

Assemblers should encode address bits 2-0 as zero for
accesses in little-endian mode. However, in big-endian mode
(epsr BE bit = 1), DCCU accesses should have address bit 2
active. Thus, software for big-endian access to the DCCU
must differ from little-endian software. That allows an
external CCU to be accessed in both big- and little-endian
modes.

When reading from the DCCU, the access latency is 
the same as reading data from the data cache — the 
data is ready for use as a source by the second 
instruction after the load. The first instruction after 
the load may use the data, but that instruction will 
experience a one-clock freeze before the data be- 
comes available. 



2.6 Instruction Set 

Table 2.9 shows the complete set of instructions for 
the i860 XP microprocessor, grouped by function 
within processing unit. Refer to Section 10 for an 
algorithmic definition of each instruction. The in- 
struction set of the i860 XP microprocessor is fully 
upward compatible with that of the i860 XR micro- 
processor, extended in a few ways to better serve 
certain application domains. User-level software ap- 
plications written for the i860 XR microprocessor will 
run unmodified on the i860 XP microprocessor, but 
some supervisor code (for example, trap handlers) 
may need minor modifications. The i860 XR micro- 
processor instruction set has been extended with 
the following instructions: 

• ldio, stio: I/O load and store instructions 

• ldint: Load interrupt instruction to perform an interrupt
acknowledge cycle and read the interrupt vector. Used to
emulate the Intel 486 interrupt acknowledge sequence.

• scyc: A special-cycle instruction, used to gener- 
ate bus cycles that signal invalidation and syn- 
chronization of an external cache. 

• pfld.q: A pipelined, floating-point load of 128 bits. 

Table 2.8. CCU Addresses

                                Little Endian   Big Endian
Mnemonic    A11-A8    A7-A4     A3-A0           A3-A0
cbr_i       0000      0abc      d000            d100
cget        1111      0110      0000            0100
cnewcurr    1111      1100      0000            0100
cstat       1111      1100      1000            1100
cstatci     1111      1101      0000            0100
cstatn      1111      1101      1000            1100
celm        1111      1110      1000            1100
cver        1111      1111      1000            1100

NOTE:

Variable i is a 4-bit index formed by A6-A3. Let its binary
form be represented by the symbols abcd.
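
The nibble columns of Table 2.8 can be combined into a 12-bit offset within the DCCU page. The following is a minimal sketch of that arithmetic, not an Intel-supplied tool: the names CCU_FIXED, ccu_offset, and indexed_offset are hypothetical, and only a subset of the table rows is transcribed.

```python
# Nibble fields (A11-A8, A7-A4, little-endian A3-A0, big-endian A3-A0)
# transcribed from a subset of the fixed rows of Table 2.8.
# All names here are illustrative assumptions.
CCU_FIXED = {
    "cget":     (0b1111, 0b0110, 0b0000, 0b0100),
    "cnewcurr": (0b1111, 0b1100, 0b0000, 0b0100),
    "cstat":    (0b1111, 0b1100, 0b1000, 0b1100),
    "cstatci":  (0b1111, 0b1101, 0b0000, 0b0100),
    "cstatn":   (0b1111, 0b1101, 0b1000, 0b1100),
}


def ccu_offset(mnemonic, big_endian=False):
    """Return the 12-bit page offset (A11-A0) for a fixed CCU address."""
    a11_8, a7_4, little, big = CCU_FIXED[mnemonic]
    low = big if big_endian else little
    return (a11_8 << 8) | (a7_4 << 4) | low


def indexed_offset(i, big_endian=False):
    """Return the offset for the indexed row of the table.

    The 4-bit index i (binary abcd) occupies A6-A3, A11-A7 are zero,
    and A2 is set only for big-endian accesses.
    """
    if not 0 <= i <= 15:
        raise ValueError("index must fit in four bits")
    return (i << 3) | (0b100 if big_endian else 0)
```

For instance, ccu_offset("cstat") yields 0xFC8 and ccu_offset("cstat", big_endian=True) yields 0xFCC, which differ only in address bit 2, as the text on big-endian access describes.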
Table 2.9. Instruction Set (1 of 2)

Core Unit

Mnemonic     Description

Load and Store Instructions
ld.x         Load integer
st.x         Store integer
fld.y        F-P load
fst.y        F-P store
pfld.y       Pipelined F-P load
pst.d        Pixel store

Register to Register Move
ixfr         Transfer integer to F-P register

Integer Arithmetic Instructions
addu         Add unsigned
adds         Add signed
subu         Subtract unsigned
subs         Subtract signed

Shift Instructions
shl          Shift left
shr          Shift right
shra         Shift right arithmetic
shrd         Shift right double

Logical Instructions
and          Logical AND
andh         Logical AND high
andnot       Logical AND NOT
andnoth      Logical AND NOT high
or           Logical OR
orh          Logical OR high
xor          Logical exclusive OR
xorh         Logical exclusive OR high

Control-Transfer Instructions
br           Branch direct
bri          Branch indirect
bc           Branch on CC
bc.t         Branch on CC taken
bnc          Branch on not CC
bnc.t        Branch on not CC taken
bte          Branch if equal
btne         Branch if not equal
bla          Branch on LCC and add
call         Subroutine call
calli        Indirect subroutine call
intovr       Software trap on integer overflow
trap         Software trap



Floating-Point Unit

Mnemonic     Description

Register to Register Move
fxfr         Transfer F-P to integer register

F-P Multiplier Instructions
fmul.p       F-P multiply
pfmul.p      Pipelined F-P multiply
pfmul3.dd    3-Stage pipelined F-P multiply
fmlow.p      F-P multiply low
frcp.p       F-P reciprocal
frsqr.p      F-P reciprocal square root

F-P Adder Instructions
fadd.p       F-P add
pfadd.p      Pipelined F-P add
famov.r      F-P adder move
pfamov.r     Pipelined F-P adder move
fsub.p       F-P subtract
pfsub.p      Pipelined F-P subtract
pfgt.p       Pipelined greater-than compare
pfeq.p       Pipelined equal compare
fix.v        F-P to integer conversion
pfix.v       Pipelined F-P to integer conversion
ftrunc.v     F-P to integer truncation

Dual-Operation Instructions
pfam.p       Pipelined F-P add and multiply
pfsm.p       Pipelined F-P subtract and multiply
pfmam.p      Pipelined F-P multiply with add
pfmsm.p      Pipelined F-P multiply with subtract

Long Integer Instructions
fisub.z      Long-integer subtract
pfisub.z     Pipelined long-integer subtract
fiadd.z      Long-integer add
pfiadd.z     Pipelined long-integer add

Graphics Instructions
fzchks       16-bit Z-buffer check
pfzchks      Pipelined 16-bit Z-buffer check
fzchkl       32-bit Z-buffer check
pfzchkl      Pipelined 32-bit Z-buffer check
faddp        Add with pixel merge
pfaddp       Pipelined add with pixel merge
faddz        Add with Z merge
pfaddz       Pipelined add with Z merge
form         OR with MERGE register
pform        Pipelined OR with MERGE register

Table 2.9. Instruction Set (2 of 2)

Core Unit

Mnemonic     Description

I/O Instructions
ldio.x       Load I/O
stio.x       Store I/O
ldint.x      Load interrupt vector

System Control Instructions
flush        Cache flush
ld.c         Load from control register
st.c         Store to control register
lock         Begin interlocked sequence
unlock       End interlocked sequence
scyc.x       Special bus cycles

Assembler Pseudo-Operations

Mnemonic     Description

Register to Register Move
mov          Integer move
fmov.r       F-P reg-reg move
pfmov.r      Pipelined F-P reg-reg move
nop          Core no-operation
fnop         F-P no-operation
pfle.p       Pipelined F-P less-than or equal



The architecture of the i860 XP microprocessor uses
parallelism to increase the rate at which operations may be
introduced into the unit. Parallelism in the i860 XP
microprocessor is not transparent; rather, programmers have
complete control over parallelism and therefore can achieve
maximum performance for a variety of computational problems.



2.6.1 PIPELINED AND SCALAR OPERATIONS 

One type of parallelism used within the floating-point unit
is "pipelining". The pipelined architecture treats each
operation as a series of more primitive operations (called
"stages") that can be executed in parallel. Consider just
the floating-point adder as an example. Let A represent the
operation of the adder. Let the stages be represented by
A1, A2, and A3. The stages are designed such that A(j+1)
for one adder instruction can execute in parallel with A(j)
for the next adder instruction. Furthermore, each A(j) can
be executed in just one clock. The pipelining within the
multiplier and graphics units can be described similarly,
except that the number of stages may be different.

Figure 2.14 illustrates three-stage pipelining as found in
the floating-point adder (also in the floating-point
multiplier when single-precision input operands are
employed). The central columns of the table represent the
three stages of the pipeline. Each stage holds intermediate
results and also (when introduced into the first stage by
software) holds status information pertaining to those
results. The table assumes that the instruction stream
consists of a series of consecutive floating-point
instructions, all of one type (i.e. all adder instructions
or all single-precision multiplier instructions). The
instructions are represented as A, B, etc. The rows of the
table represent the states of the unit at successive clock
cycles. Each time a pipelined operation is performed, the
result of the last stage of the pipeline is stored in the
destination register fdest, the pipeline is advanced one
stage, and the input operands of the operation are
transferred to the first stage of the pipeline.



                            Pipeline
Clock  Instruction  Stage 1  Stage 2  Stage 3  Result
1      A            A
2      B            B        A
3      C            C        B        A
4      D            D        C        B        A -> fdest of D
5      E            E        D        C        B -> fdest of E
6      F            F        E        D        C -> fdest of F

Figure 2.14. Pipelined Instruction Execution

In the i860 XP microprocessor, the number of pipeline
stages ranges from one to three. A pipelined operation with
a three-stage pipeline stores the result of the third prior
operation. A pipelined operation with a two-stage pipeline
stores the result of the second prior operation. A
pipelined operation with a one-stage pipeline stores the
result of the prior operation.
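
The "result of the Nth prior operation" rule can be checked with a small software model. This is only an illustrative sketch: the Pipeline class and its issue method are hypothetical names, operations are represented by labels rather than real floating-point values, and stages that have never been loaded are modeled as None.

```python
from collections import deque


class Pipeline:
    """Toy model of an N-stage pipeline; index 0 is stage 1."""

    def __init__(self, stages):
        # A bounded deque drops its rightmost (last-stage) entry
        # whenever a new entry is pushed on the left.
        self.stages = deque([None] * stages, maxlen=stages)

    def issue(self, op):
        """Start one pipelined operation.

        Returns whatever was in the last stage, i.e. the value that
        would be stored in fdest of the operation being issued.
        """
        retired = self.stages[-1]
        self.stages.appendleft(op)
        return retired


adder = Pipeline(3)
stored = [adder.issue(op) for op in "ABCDEF"]
# stored[3] is the result written when D is issued, and so on.
```

With a three-stage pipeline, issuing D returns the result of A, matching the fourth row of Figure 2.14. Constructing Pipeline(2) or Pipeline(1) models the two-stage and one-stage cases in the same way.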

There are four floating-point pipelines: one for the 
multiplier, one for the adder, one for the graphics 
unit, and one for floating-point loads. The adder 
pipeline has three stages. The number of stages in 
the multiplier pipeline depends on the precision of 
the source operands in the pipeline; it may have two 
or three stages. The graphics unit has one stage for 
all precisions. The load pipeline has three stages for 
all precisions. 

Changing the FZ (flush zero), RM (rounding mode), 
or RR (result register) bits of fsr while there are re- 
sults in either the multiplier or adder pipeline produc- 
es effects that are not defined. 



2.6.1.1 Scalar Mode 

In addition to the pipelined execution mode, the 
i860 XP microprocessor also can execute floating- 
point instructions in "scalar" mode. Most floating- 
point instructions have both pipelined and scalar 
variants, distinguished by a bit in the instruction en- 
coding. In scalar mode, the floating-point unit does 
not start a new operation until the previous floating- 
point operation is completed. The scalar operation 
passes through all stages of its pipeline before a 
new operation is introduced, and the result is stored 
automatically. Scalar mode is used when the next 
operation depends on results from the previous few 
floating-point operations (or when the compiler or 
programmer does not want to deal with pipelining). 



2.6.1.2 Pipelining Status Information 

Result status information in the fsr consists of the AA,
AI, AO, AU, and AE bits, in the case of the adder, and the
MA, MI, MO, and MU bits, in the case of the multiplier.
This information arrives at the fsr via the pipeline in one
of two ways:

1. It is calculated by the last stage of the pipeline. 
This is the normal case. 

2. It is propagated from the first stage of the pipeline.
This method is used when restoring the state of the
pipeline after a preemption. When a store instruction
updates the fsr and the U bit being written into the fsr is
set, the store updates the result status bits in the first
stage of both the adder and multiplier pipelines. When
software changes the result-status bits of the first stage
of a particular unit (multiplier or adder), the updated
result-status bits are propagated one stage for each
pipelined floating-point operation for that unit. In this
case, each stage of the adder and multiplier pipelines
holds its own copy of the relevant bits of the fsr. When
they reach the last stage, they override the normal
result-status bits computed from the last-stage result.

At the next floating-point instruction (or at certain 
core instructions), after the result reaches the last 
stage, the i860 XP microprocessor traps if any of the 
status bits of the fsr indicate exceptions. Note that 
the instruction that creates the exceptional condition 
is not the instruction at which the trap occurs. 

2.6.1.3 Precision in the Pipelines 

In pipelined mode, when a floating-point operation is 
initiated, the result of an earlier pipelined floating- 
point operation is returned. The result precision of 
the current instruction applies to the operation being 
initiated. The precision of the value stored in fdest is 
that which was specified by the instruction that initia- 
ted that operation. 

If fdest is the same as fsrc1 or fsrc2, the value being
stored in fdest is used as the input operand. In this
case, the precision of fdest must be the same as the
source precision.

The multiplier pipeline has two stages when the 
source operands are double-precision and three 
stages when they are single. This means that a pipe- 
lined multiplier operation stores the result of the sec- 
ond previous multiplier operation for double-preci- 
sion inputs and third previous for single-precision in- 
puts (except when changing precisions). 



2.6.1.4 Transition between Scalar and Pipelined Operations 

When a scalar operation is executed, it passes 
through all stages of the pipeline; therefore, any un- 
stored results in the affected pipeline are lost. To 
avoid losing information, the last pipelined opera- 
tions before a scalar operation should be dummy 
pipelined operations that unload unstored results 
from the affected pipeline. 

After a scalar operation, the values of all pipeline 
stages of the affected unit (except the last) are un- 
defined. No spurious result-exception traps result 
when the undefined values are subsequently stored 
by pipelined operations; however, the values should 
not be referenced as source operands. 

For best performance a scalar operation should not
immediately precede a pipelined operation whose
fdest is nonzero.



2.6.1.5 Pipelined Loads 

The pfld instruction is optimized for accesses that 
miss the data cache and transfer directly from mem- 
ory. Therefore, even when there is a data cache hit, 
a pfld may generate a bus cycle. The data from the 
internal cache is used only if it was modified. Other- 
wise, data is taken from the external bus, even if it 
resides in the on-board cache. 

The pfld FIFO can be extended externally, due to 
the facts that a pfld always generates a bus cycle 
and that such a cycle can be identified externally by 
the value on the CTYP pin. Software written for an 
externally-extended pfld pipeline must ensure that it 
does not pfld from a location that was modified in 
the data cache. When a pfld cache hit to a modified 
line occurs, the pfld pipeline length used by the 
i860 XP microprocessor is three stages. The modi- 
fied data from the cache is put into the internal 
three-stage data FIFO, and the third pfld instruction 
after the data cache hit will update its fdest register 
with the modified data. 



2.6.2 DUAL-INSTRUCTION MODE 

Another form of parallelism results from the fact that 
the i860 XP microprocessor can execute both a
floating-point and a core instruction simultaneously.
Such parallel execution is called dual-instruction 
mode. When executing in dual-instruction mode, the 
instruction sequence consists of 64-bit aligned in- 
struction pairs, with a floating-point instruction in the 
lower 32 bits and a core instruction in the upper 32 
bits. Table 2.9 identifies which instructions are exe- 
cuted by the core unit and which by the floating- 
point unit. 
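
The pair layout described above can be illustrated with a few lines of bit arithmetic. This is a sketch under stated assumptions: pack_pair, unpack_pair, and is_pair_aligned are hypothetical helper names, and the 32-bit values used with them are placeholders rather than real i860 instruction encodings.

```python
WORD_MASK = 0xFFFFFFFF


def pack_pair(fp_word, core_word):
    """Build a 64-bit dual-instruction pair.

    The floating-point instruction occupies the lower 32 bits and
    the core instruction the upper 32 bits.
    """
    if fp_word & ~WORD_MASK or core_word & ~WORD_MASK:
        raise ValueError("each instruction must fit in 32 bits")
    return (core_word << 32) | fp_word


def unpack_pair(pair):
    """Split a 64-bit pair into (fp_word, core_word)."""
    return pair & WORD_MASK, (pair >> 32) & WORD_MASK


def is_pair_aligned(address):
    """Dual-instruction pairs must start on a 64-bit boundary."""
    return address % 8 == 0
```

A code generator could use a check like is_pair_aligned before emitting the first pair of a dual-instruction sequence.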

Programmers specify dual-instruction mode either by
including in the mnemonic of a floating-point instruction
a d. prefix or by using the Assembler directives
.dual ... .enddual. Both of the specifications
cause the D-bit of floating-point instructions to be 
set. If the i860 XP microprocessor is executing in 
single-instruction mode and encounters a floating- 
point instruction with the D-bit set, one more 32-bit 
instruction is executed before dual-mode execution 
begins. If the i860 XP microprocessor is executing in 
dual-instruction mode and a floating-point instruction 
is encountered with a clear D-bit, then one more pair 
of instructions is executed before resuming single-in- 
struction mode. Figure 2.15 illustrates two variations 
of this sequence of events: one for extended se- 
quences of dual-instructions and one for a single in- 
struction pair. 

Note that d.fnop cannot be used to initiate dual in- 
struction mode. 

Figure 2.15. Dual-Instruction Mode Transitions (1 of 2)

Figure 2.15. Dual-Instruction Mode Transitions (2 of 2)

When a 64-bit dual-instruction pair sequentially follows a
delayed branch instruction in dual-instruction mode, both
32-bit instructions are executed.



2.6.3 DUAL-OPERATION INSTRUCTIONS 

Special dual-operation floating-point instructions 
(add-and-multiply, subtract-and-multiply) use both 
the multiplier and adder units within the floating- 
point unit in parallel to efficiently execute such com- 
mon tasks as evaluating systems of linear equa- 
tions, performing the Fast Fourier Transform (FFT), 
and performing graphics transformations. 

The instruction classes pfam fsrc1, fsrc2, fdest,
pfmam fsrc1, fsrc2, fdest (add and multiply), pfsm
fsrc1, fsrc2, fdest, and pfmsm fsrc1, fsrc2, fdest
(subtract and multiply) initiate both an adder operation
and a multiplier operation. Six operands are required, but
the instruction format specifies only three operands;
therefore, there are special provisions for specifying the
operands. These special provisions consist of:

• Three special registers (KR, KI, and T) that can store
values from one dual-operation instruction and supply them
as inputs to subsequent dual-operation instructions.

  - The constant registers KR and KI can store the value of
    fsrc1 and subsequently supply that value to the
    multiplier pipeline in place of fsrc1.

  - The transfer register T can store the last-stage result
    of the multiplier pipeline and subsequently supply that
    value to the adder pipeline in place of fsrc1.

• A four-bit data-path control field in the opcode (DPC)
that specifies the operands and loading of the special
registers.

1. Operand-1 of the multiplier can be KR, KI, or fsrc1.

2. Operand-2 of the multiplier can be fsrc2, the last-stage
result of the multiplier pipeline, or the last-stage result
of the adder pipeline.

3. Operand-1 of the adder can be fsrc1, the T-register, the
last-stage result of the multiplier pipeline, or the
last-stage result of the adder pipeline.

4. Operand-2 of the adder can be fsrc2, the last-stage
result of the multiplier pipeline, or the last-stage result
of the adder pipeline.

Figure 2.16 shows all the possible data paths sur- 
rounding the adder and multiplier. The DPC field in 
these instructions selects different data paths. Sec- 
tion 10 shows the various encodings of the DPC 
field. 

Note that the mnemonics pfam.p, pfsm.p, 
pfmam.p, and pfmsm.p are never used as such in 
the assembly language; these mnemonics are used 
here to designate classes of related instructions. 
Each value of DPC has a unique mnemonic associ- 
ated with it. 

Figure 2.16. Dual-Operation Data Paths 






2.7 Addressing Modes 

Data access is limited to load and store instructions.
Memory addresses are computed from two fields of load and
store instructions: isrc1 and isrc2.

1. isrc1 either contains the identifier of a 32-bit integer
register or contains an immediate 16-bit address offset.

2. isrc2 always specifies a register.

Because either isrc1 or isrc2 may be null (zero), a
variety of useful addressing modes result:

offset + register     Useful for accessing fields within a record,
                      where register points to the beginning of the
                      record. Useful for accessing items in a stack
                      frame, where register is r3, the register used
                      for pointing to the beginning of the stack
                      frame.

register + register   Useful for two-dimensional arrays or for array
                      access within the stack frame.

register              Useful as the end result of any arbitrary
                      address calculation.

offset                Absolute address into the first or last 32K of
                      the logical address space.



In addition, the floating-point load and store instructions
may select autoincrement addressing. In this mode isrc2 is
replaced by the sum of isrc1 and isrc2 after performing the
load or store. This mode makes stepping through arrays more
efficient, because it eliminates one address-calculation
instruction.
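
The address arithmetic behind these modes can be sketched in a few lines. The helper names below are hypothetical, and two details are assumptions rather than statements from this section: that the 16-bit immediate offset is sign-extended (inferred from the "first or last 32K" remark) and that addresses wrap at 32 bits.

```python
ADDRESS_MASK = 0xFFFFFFFF


def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed offset (assumed)."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def effective_address(src1, src2):
    """Sum of the two address fields, wrapped to 32 bits (assumed)."""
    return (src1 + src2) & ADDRESS_MASK


def autoincrement_access(registers, src1_value, src2_name):
    """Model an autoincrement load or store.

    The access uses src1 + src2, and the src2 register is then
    replaced by that sum. Returns the address that was accessed.
    """
    address = effective_address(src1_value, registers[src2_name])
    registers[src2_name] = address
    return address


# Offset-only mode: the register field is null, so the reachable
# addresses are the first 32K and, after sign extension, the last 32K.
low_top = effective_address(sign_extend16(0x7FFF), 0)
high_base = effective_address(sign_extend16(0x8000), 0)
```

Calling autoincrement_access repeatedly with the same stride walks an array without a separate add instruction, which is the saving the paragraph above describes.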



2.8 Traps and Interrupts

Traps are caused by exceptional conditions detected in
programs or by external interrupts. Traps cause
interruption of normal program flow to execute a special
program known as a trap handler. Traps are divided into the
types shown in Table 2.10.



2.8.1 TRAP HANDLER INVOCATION

This section applies to traps other than reset. When a trap
occurs, execution of the current instruction is aborted.
Except for bus error and parity error traps, the
instruction is restartable. The processor takes the
following steps while transferring control to the trap
handler:

1. Copies U (user mode) of the psr into PU (previous U).

2. Copies IM (interrupt mode) into PIM (previous IM).

3. Sets U to zero (supervisor mode).

4. Sets IM to zero (interrupts disabled).

5. If the processor is in dual-instruction mode, it sets
DIM; otherwise it clears DIM.

6. If the processor is in single-instruction mode and the
next instruction will be executed in dual-instruction mode
or if the processor is in dual-instruction mode and the
next instruction will be executed in single-instruction
mode, DS is set; otherwise, it is cleared.

7. The appropriate trap type bits in psr and epsr are set
(IT, IN, IAT, DAT, FT, OF, IL, PI, PT, BEF, PEF). Several
bits may be set if the corresponding trap conditions occur
simultaneously.

8. An address is placed in the fault instruction register
(fir) to help locate the trapped instruction. In
single-instruction mode, the address in fir is the address
of the trapped instruction itself. In dual-instruction
mode, the address in fir is that of the floating-point half
of the dual instruction. If an instruction or data access
fault occurred, the associated core instruction is the
high-order half of the dual instruction (fir + 4). In
dual-instruction mode, when a data access fault occurs in
the absence of other trap conditions, the floating-point
half of the dual instruction will already have been
executed (except in the case of the fxfr instruction).

The processor begins executing the trap handler by
transferring execution to virtual address 0xFFFFFF00. The
trap handler begins execution in single-instruction mode.
The trap handler must examine the trap-type bits in psr
(IT, IN, IAT, DAT, FT) and epsr (OF, IL, PT, PI, BEF, PEF)
to determine the cause or causes of the trap.
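
The state changes the processor makes on trap entry, as listed in this section, can be summarized as a small model. It is only a sketch: enter_trap is a hypothetical name, the psr and epsr flags are kept in one dictionary keyed by name for brevity, and no real bit positions are implied.

```python
TRAP_VECTOR = 0xFFFFFF00  # virtual address where the handler starts


def enter_trap(flags, in_dual_mode, next_in_dual_mode, trap_bits):
    """Return the flag values after the processor's trap-entry steps.

    flags: dict of named psr/epsr bits (a simplification).
    trap_bits: names of the trap-type bits to set, e.g. ["DAT"].
    """
    new = dict(flags)
    new["PU"] = flags["U"]        # save previous user-mode bit
    new["PIM"] = flags["IM"]      # save previous interrupt-mode bit
    new["U"] = 0                  # supervisor mode
    new["IM"] = 0                 # interrupts disabled
    new["DIM"] = int(in_dual_mode)
    # DS marks a trap taken on a single/dual mode transition.
    new["DS"] = int(in_dual_mode != next_in_dual_mode)
    for name in trap_bits:        # several causes may coincide
        new[name] = 1
    return new
```

A handler written against this model would read PU and PIM to learn the interrupted context and then test each trap-type flag in turn, since more than one may be set.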



2.8.2 INSTRUCTION FAULT 

This fault is caused by any of the following condi- 
tions. In all cases the processor sets the IT bit be- 
fore entering the trap handler. 

1. By the trap instruction. When trap is executed in 
dual-instruction mode, the floating-point compan- 
ion of the trap instruction is not executed before 
the trap is taken. 

2. By the intovr instruction. The trap occurs only if 
OF in epsr is set when intovr is executed. To 
distinguish between cases 1 and 2, the trap han- 
dler must examine the instruction addressed by 
fir. The trap handler should clear OF before re- 
turning. When intovr causes a trap in dual-in- 
struction mode, the floating-point companion of 
the intovr instruction is completely executed be- 
fore the trap is taken. 

Table 2.10. Types of Traps

                      Indication               Caused by
Type                  psr   epsr      fsr      Condition                         Instruction
Instruction Fault     IT    OF                 Software traps                    trap, intovr
                            IL                 Missing unlock                    Any
                            PT & PI            Pipeline usage                    Any scalar or pipelined instruction
                                                                                 that uses a pipeline
Floating-Point Fault  FT              SE       Floating-point source exception   Any M- or A-unit except fmlow
                                               Floating-point result exception:  Any M- or A-unit except fmlow, pfgt,
                                      AO, MO     overflow                        and pfeq. Reported on any F-P
                                      AU, MU     underflow                       instruction, pst, fst, and sometimes
                                      AI, MI     inexact result                  fld, pfld, and ixfr
Instruction Access    IAT                      Address translation exception     Any
Fault                                          during instruction fetch
Data Access Fault     DAT                      Load/store address translation    Any load/store
                                               exception
                                               Misaligned operand address        Any load/store
                                               Operand address matches db        Any load/store
                                               register
Parity Error Fault    IN    PEF                Parity error on data pins during bus read operation
                                               when PEN# pin active
Bus Error Fault       IN    BEF                External interrupt signal on BERR pin
Interrupt             IN    INT                External interrupt signal on INT pin
Reset                 None  PEF, BEF           Hardware RESET signal




3. By violation of lock/unlock protocol, explained 
below. (Note that trap and intovr should not be 
used within a locked sequence; otherwise, it 
would be difficult to distinguish between this and 
the prior cases.) 

4. By execution of an instruction that uses a pipeline 
when the PT bit of epsr is set. (Refer to section 
2.8.2.2.) 



2.8.2.1 Lock Protocol 

The lock protocol requires the following sequence of 
activities: 

1. lock 

2. Any load or store instruction. For compatibility
with future processor generations, this should be
a load.

3. unlock 

4. Any load or store instruction. For compatibility 
with future processor generations, this should be 
a store. 



There may be other instructions between any of 
these steps. The bus is locked after step 2, and re- 
mains locked until step 4. Step 4 must follow step 1 
by 30 instructions or less; otherwise, an instruction 
trap occurs. In case of a trap, IL is also set. If the 
load or store instruction of step 2 accesses a previ- 
ously unaccessed page (A = 0), the bus is locked 
briefly while the A bit is set, unlocked, then locked 
again to satisfy the lock instruction and start the 
locked sequence. 
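
The four-step protocol above can be expressed as a small checker over a list of mnemonics. This is a sketch under stated assumptions: lock_sequence_ok and MEMORY_OPS are hypothetical names, the set of mnemonics treated as loads and stores is an illustrative choice, and "30 instructions or less" is read as a difference of at most 30 in instruction position between the lock and the final load or store.

```python
# Mnemonic stems treated as "any load or store" (an assumption).
MEMORY_OPS = {"ld", "st", "fld", "fst", "pfld", "pst"}


def lock_sequence_ok(mnemonics, limit=30):
    """Check a straight-line instruction list against the lock protocol.

    Returns True if no lock is present, or if the first lock is
    followed by load/store, unlock, load/store with the final access
    at most `limit` instructions after the lock. Returns False if
    the sequence is incomplete or exceeds the limit.
    """
    step = 0
    lock_index = None
    for index, mnemonic in enumerate(mnemonics):
        stem = mnemonic.split(".")[0]
        if step == 0:
            if stem == "lock":
                step, lock_index = 1, index
        elif index - lock_index > limit:
            return False          # would raise an instruction trap
        elif step == 1 and stem in MEMORY_OPS:
            step = 2              # bus becomes locked here
        elif step == 2 and stem == "unlock":
            step = 3
        elif step == 3 and stem in MEMORY_OPS:
            return True           # bus released after this access
    return step == 0
```

Other instructions may appear between the steps; the checker simply skips them, as the protocol permits.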

2.8.2.2 Using PT and PI Bits 

The PI and PT bits are provided to help the trap 
handler avoid unnecessarily saving and restoring the 
pipelines (refer to the section "Pipeline Preemption" 
in the i860 Microprocessor Family Programmer's 
Reference Manual). 

Trap handlers that use PI or PT must initially examine fsr.
If a pending trap exists (that is, if the FTE
(floating-point trap enable) bit is set and any of the
floating-point exception bits AI, AO, AU, MI, MO, or MU is
active), the trap handler must save the pipelines. The
i860 XP microprocessor, like the i860 XR microprocessor,
may set an fsr exception bit before the floating-point trap
is generated, and this pending trap relies on information
in the pipeline. For example, an external interrupt might
invoke the trap handler between the scalar floating-point
instruction that produces an overflow and the next
floating-point operation (the one that would cause a branch
to the trap handler for the floating-point trap).

If no pending trap exists, the handler can follow ei- 
ther of the following two methods: 

• Using both PT and PI: Upon invocation, the trap 
handler saves the state of PI and PT (in epsr), 
but does not save the pipes. If PI is found set 
(which means that the interrupted code needs 
the state information currently in the floating- 
point pipelines), the handler sets PT and clears PI 
(with a single st.c to epsr instruction), then con- 
tinues with trap processing. If the pipes are used 
during trap handling (even by a scalar instruc- 
tion), a trap will be generated with IT and PI set 
by hardware. The trap handler may then check PI 
and PT, and if both are set, clear PT, PI, and IT, 
save the pipes, set an indication that they were 
saved, and restart execution from the instruction 
that caused the trap. At the end of trap handling, 
the trap handler restores the pipes if they were 
saved, and restores PI and PT to their values be- 
fore the trap. This method avoids both saving and 
restoring the pipes, assuming that most trap han- 
dling sequences do not alter the pipes, and there- 
fore a trap for PT= 1 will not happen very often. 

• Using only PI: Another approach is to leave PT = 0, using
only the PI bit, which the processor sets each time a
pipelined instruction or pfld is encountered (even if the
floating-point instruction is suppressed due to KNF = 1).
The trap handler saves PI, saves the pipes if PI is set,
sets an indication that they were saved, and clears PI. At
the end of trap handling, the trap handler restores the
pipes if they were saved, and restores PI to its value
before the trap. With this method, the pipes are sometimes
saved and restored unnecessarily if the trap handler code
does not use the pipes. This method is advised when it is
known that the trap handler uses the pipes.

2.8.3 FLOATING-POINT FAULT 

The floating-point fault is reported on floating-point
instructions, pst, fst, and sometimes fld, pfld, and ixfr.
The floating-point faults of the i860 XP microprocessor
support the floating-point exceptions defined by the IEEE
standard as well as some other useful classes of
exceptions. The i860 XP microprocessor divides these into
two classes: source exceptions and result exceptions. The
numerics library supplied by Intel provides the IEEE
standard default handling for all these exceptions.

2.8.3.1 Source Exception Faults 

All exceptional operands, including infinities, denor- 
malized numbers and NaNs, cause a floating-point 
fault and set SE in the fsr. Source exceptions are 
reported on the instruction that initiates the opera- 
tion. For pipelined operations, the pipeline is not ad- 
vanced. 

SE is undefined for faults on fld, pfld, fst, pst, and 
ixfr instructions under these conditions: 

• In single-instruction mode, always. 

• In dual-instruction mode, when the companion in- 
struction is not a multiplier or adder operation. 

2.8.3.2 Result Exception Faults 

The result exceptions include: 

• Overflow. The absolute value of the rounded true 
result would exceed the largest positive finite 
number in the destination format. 

• Underflow (when FZ is clear). The absolute value
of the rounded true result would be smaller than
the smallest positive finite number in the
destination format.

• Inexact result (when TI is set). The result is not
exactly representable in the destination format.
For example, the fraction 1 / 3 cannot be precisely 
represented in binary form. This exception occurs 
frequently and indicates that some (generally ac- 
ceptable) accuracy has been lost. 

The point at which a result exception is reported de- 
pends upon whether pipelined operations are being 
used: 

• Scalar (nonpipelined) operations. Result
exceptions are reported on the next floating-point,
fst.x, or pst.x (and sometimes fld, pfld, ixfr)
instruction after the scalar operation. When a trap
occurs, the last stage of the affected unit
contains the result of the scalar operation.

• Pipelined operations. Result exceptions are
reported when the result is in the last stage and the
next floating-point (and sometimes fld, pfld, ixfr)
instruction is executed. When a trap occurs, the
pipeline is not advanced, and the last-stage
results (that caused the trap) remain unchanged.

When no trap occurs (either because FTE is clear or
because no exception occurred), the pipeline is
advanced normally by the new floating-point operation.
The result-status bits of the affected unit are
undefined until the point that result exceptions are
reported. At this point, the last-stage result-status
bits (bits 29..22 and 16..9 of the fsr) reflect the
values in the last stages of both the adder and
multiplier. For example, if the last-stage result in
the multiplier has overflowed and a pfadd is started,
a trap occurs and MO is set.

For scalar operations, the RR bits of fsr report in 
which register the result was stored. RR is updated 
when the scalar instruction is initiated. The result ex- 
ception trap, however, occurs on a subsequent in- 
struction. Programmers must prevent intervening 
stores to fsr from modifying the RR bits. Prevention 
may take one of the following forms: 

• Before any store to fsr when a result exception
may be pending, execute a dummy floating-point
operation to trigger the result-exception trap.

• Always read from fsr before storing to it, and
mask updates so that the RR bits are not
changed.
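The second form (read, mask, store) can be sketched as a pure bit-merge in C. The RR field position used here (fsr bits 21..17) is an assumption that this section does not state; check it against the fsr register layout before relying on it. The function name is hypothetical.

```c
#include <stdint.h>

/* ASSUMPTION: RR occupies fsr bits 21..17. Verify against the fsr layout. */
#define FSR_RR_SHIFT 17u
#define FSR_RR_MASK  (UINT32_C(0x1F) << FSR_RR_SHIFT)

/*
 * Build the value to store to fsr: take every field from desired_fsr
 * except RR, which is carried over unchanged from the value just read.
 */
uint32_t fsr_merge_preserving_rr(uint32_t current_fsr, uint32_t desired_fsr)
{
    return (desired_fsr & ~FSR_RR_MASK) | (current_fsr & FSR_RR_MASK);
}
```

A caller would read fsr, pass that value as current_fsr together with the intended new settings, and store the returned word.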

For pipelined operations, RR is cleared; the result is 
in the last stage of the pipeline of the appropriate 
unit. The trap handler must flush the pipeline, saving 
the results and the status bits. 

In either pipelined or scalar mode, the trap handler
must compute the result to be returned. In either
case, the result delivered by the CPU has the same
significand as the true result and has an exponent
that is the low-order bits of the true result's
exponent. The trap handler can inspect the delivered
result, compute the result appropriate for that
instruction (a NaN or an infinity, for example), and
store the computed result. If RR is nonzero, the trap
handler must store the computed result in the register
specified by RR; if RR is zero, it must load the last
stage of the pipeline with the computed result instead
of the saved result.

Result exceptions may be reported for both the ad- 
der and multiplier at the same time. In this case, the 
trap handler should fix up the last stage of both pipe- 
lines. 



2.8.4 INSTRUCTION ACCESS FAULT 

This trap occurs during address translation for in- 
struction fetches in any of these cases: 

• The address fetched is in a page whose P (present)
bit in the page table is clear (not present).



• The address fetched is in a supervisor mode 
page, but the processor is in user mode. 

• The address fetched is in a page whose PTE has 
A = 0, and the access occurs during a locked 
sequence (i.e. between lock and unlock). 

Note that several instructions are fetched at one 
time, either due to instruction prefetching or to in- 
struction caching. Therefore, a trap handler can 
change from supervisor to user mode and continue 
to execute instructions fetched from a supervisor 
page. An instruction access trap occurs only when 
the next group of instructions is fetched from a su- 
pervisor page (up to eight instructions later). If, in the 
meantime, the handler branches to a user page, no 
instruction access trap occurs. No protection viola- 
tion results, because the processor does not permit 
data accesses to supervisor pages while running in 
user mode. 

2.8.5 DATA ACCESS FAULT 

This trap results from an abnormal condition detect- 
ed during data operand fetch or store. Such an ex- 
ception can be due only to one of the following caus- 
es: 

• An attempt is being made to write to a page 
whose D (dirty) bit is clear. 

• A memory operand is misaligned (is not located 
at an address that is a multiple of the length of 
the data). 

• The address stored in the debug register is equal 
to one of the addresses spanned by the operand. 

• The operand is in a not-present page. 

• An attempt is being made from user level to write 
to a read-only page or to access a supervisor-lev- 
el page. 

• The operand is in a page whose PTE has A = 0, 
and the access occurs during a locked sequence 
(i.e. between lock and unlock). 

• Write protection (determined by epsr bit WP = 1) 
is violated in supervisor mode. 
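Two of the causes listed above are simple address arithmetic and can be sketched in C. This is a minimal illustration, assuming operand lengths are powers of two (1, 2, 4, 8, or 16 bytes); the function names are hypothetical, and the breakpoint check ignores the BR and BW enable bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when addr is not a multiple of the operand length in bytes.
 * Assumes len is a power of two. */
bool is_misaligned(uint32_t addr, uint32_t len)
{
    return (addr & (len - 1u)) != 0;
}

/* True when the debug-register address db is one of the addresses
 * spanned by the operand, that is, addr <= db < addr + len.
 * Unsigned wraparound makes the single comparison cover db < addr. */
bool spans_breakpoint(uint32_t db, uint32_t addr, uint32_t len)
{
    return (uint32_t)(db - addr) < len;
}
```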

When a data access trap is taken on a pipelined 
floating-point instruction that occurs immediately af- 
ter the load or store instruction that causes the trap, 
the destination register of the pipelined floating-point 
instruction may be partially updated. Correct execu- 
tion will occur when the trap handler resumes execu- 
tion after handling the DAT, because the pipelined 
floating-point instruction will then correctly update its 
destination register. 

2.8.6 PARITY ERROR TRAP

If the PEN# pin is active and the bus unit detects a
parity error during a bus read operation, the
processor sets PEF and IN, then generates a trap.
Further parity error traps are masked as soon as PEF
is set. To reenable such traps, software must clear
PEF and unfreeze BEAR by executing ld.c bear, rdest.

The interrupted program is not restartable. BS (bus 
or parity error trap in supervisor mode) is set by the 
i860 XP microprocessor when a parity error occurs 
while the processor is in supervisor mode. The oper- 
ating system can use this bit to decide, for example, 
whether to abort the process (user mode) or reboot 
the system (supervisor mode). 

2.8.7 BUS ERROR TRAP 

When external hardware asserts the BERR pin, the
processor sets BEF (bus error flag) and IN
(interrupt), and then traps. Further BERR traps are
masked as soon as BEF is set by hardware. To
reenable such traps, software must clear BEF and
unfreeze BEAR by executing ld.c bear, rdest.

BS (bus or parity error trap in supervisor mode) is set 
by the i860 XP microprocessor when a bus error oc- 
curs while the processor is in supervisor mode. The 
operating system can use this bit to decide, for ex- 
ample, whether to abort the process (user mode) or 
reboot the system (supervisor mode). 

2.8.8 INTERRUPT TRAP 

An interrupt is an event that is signaled from an ex- 
ternal source. If the processor is executing with in- 
terrupts enabled (IM set in the psr), the processor 
sets the interrupt bit IN in the psr and INT in the 
epsr, then generates an interrupt trap. 

Vectored interrupts are implemented by interrupt
controllers and software. Software can use the ldint
instruction to generate an interrupt acknowledge
(INTA) cycle. This instruction generates a bus cycle
with INTA cycle specifications, and places the data
returned from the bus in the destination register.
Tags are not checked in the data cache for hit, and
the cycle is not burstable.

The Intel 486 microprocessor generates two INTA
cycles as a response to an interrupt and inserts four
idle clocks in between. To generate an interrupt
acknowledge sequence that is compatible with the
Intel 486 microprocessor, the ldint instruction
sequence documented in section 5.1.4 should be
executed.



2.8.9 RESET TRAP 

When the i860 XP microprocessor is reset, execution
begins in single-instruction mode at virtual
address 0xFFFFFF00. This is the same address as for
other traps. The reset trap can be distinguished from
other traps by the fact that no trap bits are set. The
instruction cache is flushed. The bits DPS, BL, and
ATE in dirbase are cleared. CS8 is initialized by the
value at the INT pin at the end of reset. The
read-only fields of the epsr are set to identify the
processor, while the IL, WP, and PBM bits are cleared.
The bits U, IM, BR, and BW in psr are cleared, as are
the trap bits FT, DAT, IAT, IN, and IT. All other bits
of psr and all other register contents are undefined.
Refer to Table 2.11 for a summary of these initial
settings.

The software must ensure that the control registers 
are properly initialized before performing operations 
that depend on the values of those registers. 

Reset code must initialize the floating-point pipeline 
state to zero with floating-point traps disabled to en- 
sure that no spurious floating-point traps are gener- 
ated. 

After a RESET the i860 XP microprocessor starts 
execution at supervisor level (U = 0). Before branch- 
ing to the first user-level instruction, the RESET trap 
handler or subsequent initialization code has to set 
PU and a trap bit so that an indirect branch instruc- 
tion will copy PU to U, thereby changing to user lev- 
el. 



2.9 Debugging 

The i860 XP microprocessor supports debugging
with both data and instruction breakpoints. The
features of the i860 XP microprocessor architecture
that support debugging include:

• db (data breakpoint register), which permits
specification of a data address that the i860 XP
microprocessor will monitor.

• BR (break read) and BW (break write) bits of the
psr, which enable trapping of either reads or
writes (respectively) to the address in db.

• DAT (data access trap) bit of the psr, which
allows the trap handler to determine when a data
breakpoint was the cause of the trap.

• trap instruction that can be used to set
breakpoints in code. Any number of code breakpoints
can be set. The values of the isrd and isrc2
fields help identify which breakpoint has
occurred.

• IT (instruction trap) bit of the psr, which allows
the trap handler to determine when a trap
instruction was the cause of the trap.






Table 2.11. Register and Cache Values after Reset

Registers                  Initial Value
-------------------------  ------------------------------------------
Integer Registers          Undefined
Floating-Point Registers   Undefined
psr                        U, IM, BR, BW, FT, DAT, IAT, IN, IT = 0;
                           others are undefined
epsr                       IL, WP, PBM, BE, PT = 0; BEF, PEF = 1;
                           Processor Type, Stepping Number, DCS,
                           SO are read only; others are undefined
db                         Undefined
dirbase                    DPS, BL, LB, ATE = 0; others are undefined
fir                        Undefined
fsr                        Undefined
bear                       Undefined
p3-p0                      Undefined
ccr                        CO, DO = 0; others are undefined
KR, KI, T, MERGE           Undefined
NEWCURR                    Undefined
STATUS                     InLoop, Nested, Detached = 0

Caches                     Initial Value
-------------------------  ------------------------------------------
Instruction Cache          All entries invalid
Data Cache                 All entries invalid
TLB                        All entries invalid



3.0 ON-CHIP CACHES 

By holding data, instructions, and address transla- 
tion on-chip, the caches of the i860 XP microproces- 
sor provide the following advantages: 

1. Low chip count for the CPU subsystem.

2. Wide processor-to-cache path: 16 bytes for data,
8 bytes for instructions.

3. Fast access without requiring much additional
high-speed design in the system. The fast
(50 MHz) cache-access circuitry is hidden on
chip; the external bus can respond more slowly
without significantly degrading performance.



3.1 Address Translation Caches 

The i860 XP microprocessor allows both four-Kbyte
and four-Mbyte page sizes, and a separate
translation look-aside buffer (TLB) is used to cache
address translation information for each page size.
The TLB for four-Kbyte pages (Figure 3.1) has 64
entries, and the TLB for four-Mbyte pages (Figure 3.2)
has 16 entries. Both are four-way set associative. The
TLBs function when paging is enabled. When a page
is first accessed, its translation information is saved
in the appropriate TLB along with other page
attributes, such as access rights and cacheability.
Every address translation operation looks up the
virtual address simultaneously in both TLBs. Only if
the necessary paging information is not in either of
the caches must the paging tables in memory be
referenced. Both TLBs employ a random replacement
algorithm to choose which of the four ways to replace.

If an instruction's virtual address is found in the in- 
struction cache, the virtual address is not translated, 
and code access rights are not verified. However, 
when an instruction's virtual address is not found in 
the cache, address translation does occur, and all 
access rights are verified. The virtual addresses of 
data are always translated, and access rights are 
always verified. 

The i860 XP microprocessor requires simultaneous 
access to data and instruction caches, but the TLBs 
can service only one address translation at a time. 
Data address translation has higher priority in the 
TLBs than instruction address translation, if both are 
required at the same time. 

Any data or instruction access fault halts address 
translation at once, and the TLB is not updated. If a 
directory read causes an access fault, the page ta- 
ble is not read at all. 

If the paging unit generates a fault (in setting the D 
bit for the first write to a nondirty page, for example), 
the corresponding entry is deleted from the TLB. 
Therefore, software does not need to invalidate the 
TLB entry in response to DAT or IAT faults. 

Figure 3.1. 4K TLB Organization

Figure 3.2. 4M TLB Organization

If TLB replacement is initiated during a locked
sequence generated by the lock instruction and if
another locked sequence has to be executed to set the
A-bit, the paging unit generates an access fault. This
helps external hardware implement "locking by
address" by preventing generation of nested lock
sequences.



3.2 Internal Instruction and Data Caches

The i860 XP microprocessor has separate data and 
instruction caches on-chip. Having separate caches 
for instructions and data allows simultaneous cache 
look-up. Up to two instructions and 128 bits of data 
can be accessed simultaneously from these caches. 
The data and instruction caches hold 16 Kbytes 
each. A line can be filled from memory with a four- 
transfer burst. 

The caches are fully transparent to applications soft- 
ware. Snooping (address monitoring) is designed 
into both instruction and data caches, to maintain 
cache consistency in multiprocessor systems. 

Each cache has two sets of tags: virtual tags used
for internal access, and physical tags used for
snooping. Figure 3.3 shows how the bits of both
virtual and physical addresses are mapped for
caching. The presence of both virtual and physical
tags supports aliasing, a situation in which the TLBs
associate a single physical address with two or more
virtual addresses.

Any area of memory can be cached, although both
software and hardware can disallow certain areas
from being cached: software by setting the CD bit in
their page table entries; hardware by deasserting the
KEN# signal for bus cycles with addresses that fall
in those areas. (If the DCCU is activated by setting
CO of the ccr register, data reads from the two
four-Kbyte pages pointed to by the CCUBASE field of
ccr are not cached, and the CACHE# signal is
inactive. This is independent of the value of KEN#.)
When both software and hardware agree that a
requested datum is cacheable, the i860 XP
microprocessor fetches an entire 32-byte line and
places it into the appropriate cache. Cache line fills
are generated only for read misses, not for write
misses. A store that misses the cache does not copy
the missed line into cache from memory, but rather
posts the datum in a write buffer, then sends it to the
external bus when the bus is available.

Figure 3.3. Cache Address Usage

3.2.1 DATA CACHE

Figure 3.4 shows the organization of the data cache. 
The data cache has two status bits per physical tag 
and one validity status bit for the virtual tag. A virtual 
tag hit is possible only when the validity bit of the 
virtual tag is set and the state of the physical tag is 
M, E, or S.

Aliasing support is built into the cache look-up algo- 
rithm. Even though a physical line may be aliased, 
the processor never enters the line twice in the data 
cache. If a virtual address is not found among the 
virtual tags in the data cache, a bus cycle is initiated 
(except a read is not issued at this time if the bus 
pipeline is full) and, at the same time, the physical 
tags are searched for the physical address (which by 
this time has been retrieved from the paging unit). 
For reads, if the physical address is found, the data 
returned from the bus is ignored, on-chip data is 
used, and the virtual tag is replaced with the new 
one. For writes, if a virtual address is not found, the
write is issued on the bus and memory is updated. If 
the physical address is found, the line in cache is 
updated, and the virtual tag is replaced with the new 
one. However, the cache state (M, E, or S) of the 
physical-address tag does not change when the vir- 
tual tag is overwritten. 
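The look-up order described above (virtual tags first, then physical tags, with the virtual tag replaced on a physical hit) can be sketched for one four-way set. This is a simplified software model, not the hardware implementation: the structure layout and names are hypothetical, and bus activity is not modeled.

```c
#include <stdbool.h>
#include <stdint.h>

enum mesi { MESI_I, MESI_S, MESI_E, MESI_M };
enum lookup_result { VIRTUAL_HIT, PHYSICAL_HIT, MISS };

/* Hypothetical model of one data cache line's tags. */
struct dline {
    uint32_t  vtag;    /* virtual tag */
    bool      vvalid;  /* validity bit of the virtual tag */
    uint32_t  ptag;    /* physical tag */
    enum mesi state;   /* state of the physical tag */
};

#define WAYS 4

enum lookup_result dcache_lookup(struct dline set[WAYS],
                                 uint32_t vtag, uint32_t ptag)
{
    /* A virtual hit needs a valid virtual tag and a physical tag in M, E, or S. */
    for (int w = 0; w < WAYS; w++)
        if (set[w].vvalid && set[w].state != MESI_I && set[w].vtag == vtag)
            return VIRTUAL_HIT;

    /* Virtual miss: search the physical tags for an alias of the same line. */
    for (int w = 0; w < WAYS; w++)
        if (set[w].state != MESI_I && set[w].ptag == ptag) {
            set[w].vtag = vtag;    /* replace the virtual tag with the new one */
            set[w].vvalid = true;  /* the M, E, or S state is left unchanged */
            return PHYSICAL_HIT;
        }

    return MISS;
}
```

Because a physical hit only retags the existing line, the model never holds the same physical line twice, which mirrors the aliasing guarantee stated in the text.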

Note that the BE (big endian) bit of epsr has no 
influence on data cache behavior. Data items are 
kept in cache in exactly the same ordering as in ex- 
ternal memory. Byte-shifting operations invoked by 
the BE bit upon loads and stores occur at the input 
to the register files only. 



3.2.1.1 Data Cache Update Policies 

To minimize bus traffic, a write-back policy is
normally used. The write-back policy (also called
copy-back and deferred-write) reduces bus traffic by
eliminating many unnecessary writes. Writes to a line
in the cache are not immediately forwarded to main
memory; instead, they are accumulated in the cache. The
modified cache line is written to main memory only
when its cache space is needed for other data,
when the modified data is needed by another
processor, or when a flush procedure is executed.

Under the write-back policy, a write that hits the 
cache utilizes it for two cycles (one to check the 
virtual tags for hit, another to update the cache line). 
However, the cache pipeline allows successive 
store hits to operate at one per cycle. The proces- 
sor's internal write buffers can hold two successive 
stores, preventing a freeze upon store miss. 

Under a write-through policy, a write request to a line 
in the cache triggers updates to both cache and 
main memory. An address decoder, for example, 
can select the write-through policy for writes to video 
RAM, where it is necessary that writes be seen on 
the video display. Software, by setting the WT
page-table bit, can select the write-through policy for
specific areas of memory, such as those used for
interprocessor message queues.

A write-once policy combines write-through with 
write-back. Write-through is employed for the first 
write to a cache line, while subsequent writes to the 
same line follow the write-back policy. Write-once is 
valuable in multiprocessor systems to maintain 
cache consistency with the least possible bus traffic. 
The first write broadcasts to other processor nodes 
the fact that a line has been modified. Write-once is 
also used if a second-level cache is attached to the 
i860 XP microprocessor to maintain consistency be- 
tween the first- and second-level caches. 

The external system can dynamically change the up- 
date policy (write-back, write-through, write-once) of 
the i860 XP microprocessor with each cache line. 



Figure 3.4. Data Cache Organization

NOTES:
M = Modified, E = Exclusive, S = Shared, I = Invalid,
V = Validity

3.2.2 INSTRUCTION CACHE

Figure 3.5 shows the organization of the instruction 
cache. The instruction cache has one validity bit that 
is common to both virtual and physical tags. Aliasing 
support for instructions consists not simply of chang- 
ing the virtual tag, but rather fetching a line whenev- 
er a virtual tag miss occurs. If the physical address 
already exists in the instruction cache, its line and its 
tags are overwritten. So, even though a physical line 
may be aliased, the processor never enters the line 
twice in the instruction cache. 



3.2.3 CACHE REPLACEMENT ALGORITHM 

The data, instruction, and address-translation 
caches all use similar algorithms to choose which of 
the four cache blocks will be overwritten when a 
miss causes a line fetch. 

First, the first invalid line (if any) in a set of four is
replaced (in the order 0, 1, 2, 3). When there are no
more invalid lines in a set, a pseudorandom
replacement algorithm chooses which valid line to replace.
The algorithm is controlled by counters inside the 
chip. RESET initializes these counters to zero, so 
that the "randomness" is deterministic and two 
i860 XP CPUs executing the same code on identical 
boards have exactly the same series of cache hits, 
misses, and replacements. 
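The selection order described above can be sketched as follows. The text does not specify the actual pseudorandom function, so a plain incrementing counter is used here purely as a stand-in; the structure and function names are hypothetical.

```c
#include <stdbool.h>

#define WAYS 4

/* Hypothetical stand-in for the on-chip replacement counter.
 * A value of 0 models the state after RESET. */
struct repl_counter {
    unsigned value;
};

/* Return the way (0..3) to overwrite on a line fetch. */
int choose_victim(const bool valid[WAYS], struct repl_counter *ctr)
{
    /* First preference: the first invalid line, in the order 0, 1, 2, 3. */
    for (int w = 0; w < WAYS; w++)
        if (!valid[w])
            return w;

    /* All lines valid: fall back to the counter-driven choice.
     * ASSUMPTION: a simple modulo counter, not the real algorithm. */
    int victim = (int)(ctr->value % WAYS);
    ctr->value++;
    return victim;
}
```

Two models started with the counter at zero and fed the same miss sequence choose the same victims, which is the deterministic behavior the paragraph describes.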



Setting ITI to invalidate the caches and TLBs also 
resets the counters used to select the set used for 
cache line replacement. This brings the i860 XP mi- 
croprocessor cache-replacement mechanism to a 
known state without resetting the whole chip. 

When the flush instruction is used to write back 
modified lines in the data cache, the flush routine 
must alter the RC (replacement control) field of 
dirbase. Therefore, replacement is not random. In- 
stead, the block (or "way") replaced is the one se- 
lected by the RB (replacement block) field of 
dirbase. 



3.2.4 CACHE CONSISTENCY PROTOCOL 

The i860 XP microprocessor implements cache
consistency by means of a MESI (Modified, Exclusive,
Shared, Invalid) protocol.



3.2.4.1 Data Cache States

Each line of the data cache of the i860 XP micro- 
processor can be in one of the states defined in Ta- 
ble 3.1. Note that the instruction cache of the 
i860 XP only implements the "SI" part of the MESI 
protocol, because the instruction cache is not writa- 
ble. 

Figure 3.5. Instruction Cache Organization

Table 3.1. MESI Cache Line States

Cache Line State:              M            E            S                  I
                               Modified     Exclusive    Shared             Invalid
-----------------------------  -----------  -----------  -----------------  --------------
This cache line is valid?      Yes          Yes          Yes                No
The memory copy is ...         out of date  valid        valid              -
Copies exist in other caches?  No           No           Maybe              Maybe
A write to this line ...       does not go  does not go  goes to bus and    goes directly
                               to bus       to bus       updates the cache  to bus

Table 3.2. Internally Initiated Cache State Transitions

State  Next State after Read       Next State after Write*
-----  --------------------------  --------------------------
I      If WB/WT# = 1, E; else S    I
       (line fill)                 (write-through)
S      S                           If WB/WT# = 1, E; else S
                                   (write-through)
E      E                           M
M      M                           M

NOTE:
* "Write" does not include write-backs due to
replacement. Those can only cause an M to I
transition.



The state of a cache line can change as the result of 
either internal or external activity related to that line. 
Table 3.2 presents the line state transitions that re- 
sult from internal activity of the i860 XP microproces- 
sor in the data cache. 

External cache-consistency support is provided 
through inquiry cycles. Inquiry cycles are initiated by 
other processors in a multiprocessor system to 
check whether an address is cached in the internal 
cache of the i860 XP microprocessor. Table 3.3 
shows the line state transitions initiated by inquiry 
cycles. 

Table 3.3. Inquiry-Initiated Cache State Transitions

State  INV = 0                  INV = 1
-----  -----------------------  -----------------------
I      I                        I
S      S                        I
E      S                        I
M      S; write back the line   I; write back the line



3.2.4.2 Write-Once Policy 

A write-once cache policy can be implemented
through use of the WB/WT# input pin. The signal
on this pin is sampled in both read and write cycles.
A read miss causes a line to enter either S or E after
the line fill. If WB/WT# is sampled LOW at the time
of NA# or the first BRDY# activation, the line
enters S state, forcing the next write hit to this line
to show up on the bus. If WB/WT# is sampled HIGH,
the line enters E state. In write-through cycles, the
state of a line is changed from S to E when WB/WT#
is sampled HIGH, so that subsequent writes
will not be written through to the bus. Thus, if this
signal is driven LOW on read cycles and HIGH on
write cycles, a write-once cache policy is
implemented. The easiest way to implement write-once
(in systems not using the 82495XP cache controller) is
to tie this pin to the W/R# output of the processor.



If the WT bit in the page table entry is set, the 
i860 XP microprocessor ignores the WB/WT# sig- 
nal for the cycles that hit that page and always per- 
forms a write-through. In other words, hardware can- 
not override software's selection of the write- 
through policy. 
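The transitions in Tables 3.2 and 3.3, together with the write-once wiring described above, can be sketched as three small C functions. This is an illustrative model only: the function names are hypothetical, a read of an invalid line is assumed to be a cacheable line fill, and the WT page-table override is not modeled.

```c
#include <stdbool.h>

enum mesi { MESI_I, MESI_S, MESI_E, MESI_M };

/* Table 3.2, read column. wb_wt is the sampled level of WB/WT#. */
enum mesi after_read(enum mesi s, bool wb_wt)
{
    if (s == MESI_I)
        return wb_wt ? MESI_E : MESI_S;  /* line fill */
    return s;                            /* S, E, and M are unchanged */
}

/* Table 3.2, write column (write-backs due to replacement excluded). */
enum mesi after_write(enum mesi s, bool wb_wt)
{
    switch (s) {
    case MESI_I: return MESI_I;                   /* write-through, no line fill */
    case MESI_S: return wb_wt ? MESI_E : MESI_S;  /* write-through */
    case MESI_E: return MESI_M;
    default:     return MESI_M;
    }
}

/* Table 3.3. inv is the INV input; *write_back reports whether the
 * modified line must be written back. */
enum mesi after_inquiry(enum mesi s, bool inv, bool *write_back)
{
    *write_back = (s == MESI_M);
    if (s == MESI_I || inv)
        return MESI_I;
    return MESI_S;
}
```

With wb_wt driven false on reads and true on writes, as when the pin is tied to W/R#, a line goes I to S on the read miss, S to E on the first (written-through) write, and E to M on the second write, which stays in the cache.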

3.2.4.3 Locked Access 

Locked accesses are those data loads and stores
that occur after a lock instruction up to and including 
the first load or store after the corresponding unlock 
instruction. 

State transitions for locked accesses differ from 
those in Table 3.2 in ways that guarantee that 
locked accesses are seen by all processors in the 
system. Any locked load or store generates both a 
cache look-up and an external bus cycle, regardless 
of cache hit or miss. 

1. In a locked read: 

a. If the required data is not found in the cache, 
the data from the bus is used. The data is 
placed in the cache if it is cacheable and 
KEN# is also asserted. 

b. If the required data is found in an unmodified 
(E or S) state, the data from the bus is used. 

c. If the data is found in the cache in a modified 
(M) state, the cached data is used, and the 
bus data is ignored, as long as no inquiry 
write-back occurs before the BRDY# of the 
bus cycle. If, however, an intervening inquiry 
write-back changes the line to S or I state, the 
bus data is used. 

2. A locked store is forced through the cache and 
issued on the bus. No more data accesses occur 
until the last BRDY# for the store. If the store 
hits the internal cache, the cache update is done 
after the last BRDY# from the bus. Note that the 
line written by a locked store remains in M state 
in spite of the write-through to the bus, because 
the length of the write-through is less than the 
line size of 32 bytes. 
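The choice of data source for a locked read (cases 1a to 1c above) reduces to a small decision function. The sketch below is a simplified reading of those rules; the names are hypothetical, and state_at_brdy stands for the line's state when BRDY# arrives, after any intervening inquiry write-back.

```c
enum mesi { MESI_I, MESI_S, MESI_E, MESI_M };
enum data_source { FROM_BUS, FROM_CACHE };

/*
 * Cached data is used only when the line was found modified and is
 * still modified at BRDY#. In every other case (miss, E or S hit, or
 * an inquiry write-back that moved the line to S or I) bus data is used.
 */
enum data_source locked_read_source(enum mesi state_at_lookup,
                                    enum mesi state_at_brdy)
{
    if (state_at_lookup == MESI_M && state_at_brdy == MESI_M)
        return FROM_CACHE;
    return FROM_BUS;
}
```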

Locked accesses are totally serializing in the sense
that: 

1. All loads and stores that precede the lock 
instruction are issued on the bus (if they miss the 
cache) before the first locked access is issued. 
The locked access can be issued before the last 
BRDY# of the prior cycle if NA# is activated in 
response to the prior cycle. 

2. No load or store after the last locked access is 
issued internally or on the bus until the final 
BRDY# for all locked accesses. 

To maximize performance, instruction fetches during 
the locked sequence are not serializing. When NA# 
invokes pipelining, instruction fetches may be issued 
while locked data fetches or stores remain on the 
bus. 



3.3 Internal Cache Consistency 

Both the instruction and the data caches can be 
snooped by externally generated inquiry cycles, and 
the result of the look-up is presented on the HIT# 
and HITM# output pins. These inquiry cycles help 
maintain consistency with caches of other proces- 
sors. However, software must take care not to cre- 
ate inconsistencies such as the following among the 
internal caches (including the TLBs): 

1. Changing the address space while leaving
virtual-address tags from the prior space in the
instruction or data cache.

2. Changing instructions in memory (or in the data 
cache) without changing them in the instruction 
cache. 

3. Changing page table information in memory (or in 
the data cache) without changing the same infor- 
mation in the TLBs. 

Under certain circumstances, such as I/O refer- 
ences, self-modifying code, page-table updates, or 
shared data in a multiprocessing system, it is neces- 
sary to bypass, to invalidate, or to flush the caches. 
The i860 XP microprocessor provides the following 
methods for doing this: 

• Bypassing Instruction and Data Caches.

1. If deasserted during cache-miss processing, 
the KEN# pin disables instruction and data 
caching of the referenced data. 

2. If the CD bit of the associated page table is 
set, caching of a page is disabled. The value of 
the CD bit is output on the PCD pin for use by 
external caches. 



3. If the WT bit of the associated page table is 
set, caching is not disabled, but writes pass 
through the cache. The value of the WT bit is 
output on the PWT pin for use by external 
caches. (Note that WT does not affect policy 
for the instruction cache, because the instruc- 
tion cache is not writable. However, when an 
instruction from a page having the WT bit of 
the PTE set is placed in the data cache, the 
write-through policy applies just as for a data 
page.) 

• Invalidating Cache Entries. Storing to the
dirbase register with the ITI bit set invalidates
each line of the instruction and address-translation
caches. In the data cache, it invalidates the
virtual tags, but not the physical tags.

• Flushing the Data Cache. The data cache is
flushed by a software routine that uses the flush
instruction. The flush instruction speeds up
write-backs. The same effect (writing back modified
lines) can be achieved with the load instruction
ld.l, but this would be more than twice as slow:
the load must first do four bus transfers to get
new data, then write back the modified line. The
flush instruction causes the write-backs without
requiring a read from external memory to replace
the modified line.

3.3.1 ADDRESS SPACE CONSISTENCY 

In a multitasking virtual-address system, the operat- 
ing system may intentionally employ aliasing, where 
several processes use the same physical memory 
while accessing it with different virtual addresses. 
When the operating system switches control from 
one process to the next, it changes the DTB field of 
the dirbase to point to a different page directory that 
defines the new address space. When this happens, 
all caches must be invalidated: the TLBs, so that the 
new page directory is read into the TLBs; the data 
and instruction caches, so that virtual addresses 
from the new space don't accidentally match cached
virtual addresses from the old space. 

The caches are invalidated by setting the ITI bit 
when writing to dirbase. Invalidating the instruction 
cache invalidates both the physical and the virtual 
tags, because the instruction cache has one status 
(valid) bit, which is common to both physical and 
virtual tags. In the data cache, setting ITI does not 
invalidate physical tags. However, any modified lines 
will eventually be written back when their space is 
required for lines from the new address space or 
when external agents on the bus express a need for 
the modified data via inquiry cycles. 

The caches are invalidated by setting the ITI bit
when writing to dirbase. Note, however, that the
operating system code that flushes the caches must
be present during the flushing. Typically this code
has the same virtual address for all processes.

NOTE: 

The mapping of the page(s) containing the cur- 
rently executing instruction, the next six in- 
structions, and any data referenced by these 
instructions should not be different in the new 
page tables when the DTB is changed. 

Enabling or disabling address translation (via the 
ATE bit) is similar to changing the DTB, in that the 
address mapping is changed. The virtual tags in the 
data and instruction cache must be invalidated prior 
to changing ATE. 

3.3.2 INSTRUCTION CACHE CONSISTENCY 

When software modifies a page containing instruc- 
tions (as when a debugger replaces an instruction 
with the trap instruction to set a breakpoint), the in- 
struction cache can become inconsistent for any of 
the following reasons: 

• Because the data cache uses a write-back policy,
changes to cached instruction pages do not
immediately update memory.

• Changes to instructions do not automatically
update the instruction cache.

• Instruction cache misses are not checked in the
data cache.

Software must ensure that modified lines containing 
instructions are written to main memory before the 
instruction cache tries to read them. There are two 
methods for this: 

1. Flush the data cache using the flush instruction.
Note that to make the instruction cache consist- 
ent with the data cache, the data cache must be 
flushed before invalidating the instruction cache. 

2. Mark all instruction pages as WT (write through) 
so that modifications to instructions are immedi- 
ately written to memory. This is the better alterna- 
tive. 

In either case, the instruction cache must be invali- 
dated (by a store to dirbase with ITI set) after a 
code page has been modified, so that the updated 
instructions will be read from memory. 
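
The ordering requirement above (write modified lines to memory first, then invalidate the instruction cache) can be illustrated with a toy model. This is only a sketch, not a model of the i860 XP hardware: the class and method names are hypothetical, and the model assumes nothing beyond what the text states, namely a write-back data cache, an instruction cache that stores do not update, and instruction misses that read memory only.

```python
class ToyMachine:
    """Toy model (hypothetical): write-back data cache plus instruction cache."""

    def __init__(self, memory):
        self.memory = dict(memory)   # main memory: address -> value
        self.dcache = {}             # modified data lines not yet written back
        self.icache = {}             # cached instructions

    def store(self, addr, value):
        # Write-back policy: the store updates only the data cache.
        self.dcache[addr] = value

    def flush_data_cache(self):
        # Write every modified line back to memory (the role of the flush routine).
        self.memory.update(self.dcache)
        self.dcache.clear()

    def invalidate_instruction_cache(self):
        # Stands in for a store to dirbase with ITI set.
        self.icache.clear()

    def fetch(self, addr):
        # Instruction misses read memory only; the data cache is not checked.
        if addr not in self.icache:
            self.icache[addr] = self.memory[addr]
        return self.icache[addr]


m = ToyMachine({0x100: "addu"})
first = m.fetch(0x100)               # instruction enters the instruction cache
m.store(0x100, "trap")               # a debugger plants a breakpoint
stale = m.fetch(0x100)               # the old instruction is still fetched
m.flush_data_cache()                 # step 1: modified line reaches memory
m.invalidate_instruction_cache()     # step 2: force a refetch from memory
fresh = m.fetch(0x100)               # the updated instruction is fetched
```

In this model, reversing the two steps would let a fetch between them reload the old instruction from memory, which is why the flush comes first.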



3.3.3 PAGE TABLE CONSISTENCY 

When the operating system modifies page tables or 
directories, the TLBs can become inconsistent with 
the modifications for any of the following reasons: 

• Because the data cache uses a write-back policy,
updates to cached page tables do not immediately
update memory.

• Changes to page tables do not automatically
update the TLB.

• The i860 XP microprocessor searches only external
memory for page directories and page tables
in the translation process. The data cache is not
searched. (Data is not transferred from the data
cache to the TLBs during TLB replacement cycles.)

Software must ensure that modified lines containing 
page table entries are written to main memory be- 
fore the paging unit tries to read them. There are two 
methods for this: 

1. Keep page tables and directories in noncachea- 
ble memory or write-through pages. 

2. Flush the data cache using the flush instruction. 

The processor itself invalidates the affected TLB en- 
try, when a trap is triggered by the need to set the A 
or D bit. In other cases, after a page table or directo- 
ry has been modified, software must invalidate the 
TLBs (by a store to dirbase with ITI set) so that the 
updated entries will be read from memory. 

The data cache does not need flushing if the pro- 
gram is modifying only the P, U, W, A, or D bits of a 
PTE (as long as the page frame address is not 
changed and the PTE itself is not in the data cache.) 
The i860 XP CPU does not use the TLB for cache 
line write-backs; it writes to the address in the physi- 
cal tag. 

Thus, a trap handler can service a data access trap
for D-bit zero merely by setting D = 1. When setting
the P or A bits, there is no need to invalidate or flush
any caches, because the processor does not load
entries into the TLB that have P = 0 or A = 0.

Two potential TLB inconsistencies are avoided auto- 
matically by the i860 XP microprocessor. 

1. If the paging unit issues a write cycle (to set the A
bit, for example), this cycle is snooped by the 
data cache for invalidation. 

2. Any TLB entry that causes a DAT or IAT is auto- 
matically invalidated. 

3.3.4 CONSISTENCY OF CACHEABILITY

Normally, an operating system ensures that the 
page attributes (CD and WT) of a memory access 
are consistent with the cache contents. However, 
the operating system can fail to maintain consisten- 
cy by the following actions: 

• Changing the CD or WT bits while related lines 
are in the cache. 

• Aliasing a physical address with virtual addresses 
that have differing CD or WT bits. 

In these situations, the i860 XP microprocessor 
gives priority to cache state. For example: 

1. If a read or write request is to a noncacheable 
page (CD = 1), but the data (or code) is found in
cache, the request is satisfied by the cache, and 
no external cycle is issued. 

2. If the physical address of a read or write request 
hits in the cache but the virtual address misses, 
the virtual tag is overwritten by the new virtual 
address, but the CD bit of the new virtual address 
is ignored. 

3. If a store to a write-through page (WT = 1) hits a
cache line in E or M state, no write-through cycle 
is issued; only the cache is updated. 

3.3.5 LOAD PIPE CONSISTENCY 

The pfld (pipelined floating-point load) instruction fa- 
cilitates transfer of data from memory to registers, 
and avoids placing data in the data cache. When 
large amounts of data are used, pfld allows the pro- 
grammer to keep rarely-used data out of the cache. 
The i860 XP microprocessor ensures consistency 
between cached data and pfld references. It checks 
the data cache and, upon a data cache hit to a modi- 
fied line, forwards data from cache into the three- 
stage pfld pipeline. 



3.3.6 SUMMARY 

Table 3.4 summarizes flush and invalidation require- 
ments, assuming that WT is set in the PTEs of in- 
struction and page-table pages: 

Table 3.4. Summary of Cache Flushing and Invalidation

                                 Flush        Invalidate
Action                           Data Cache   Caches (ITI)
-------------------------------  -----------  ------------
Setting A                        No           No
Setting P                        No           No
Clearing P                       No           Yes
Setting D                        No           No
Changing protection (U, W)       No           Yes
Setting CD or WT                 Yes          Yes
Changing PFA in a used(1) PTE    No           Yes
Changing dirbase DTB             No           Yes
Changing dirbase ATE             No           Yes
Changing epsr WP                 No           No
Setting ccr DO and CO            Yes(2)       Yes(2)
Modifying code                   No(3)        Yes


NOTES: 

1. "Used" means a PTE that at some past time had P set. 

2. If data from either of the CCU pages could have been 
cached. 

3. Assuming all instructions and their page directories and 
page tables are in write-through or noncacheable pages. 
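
Table 3.4 lends itself to a lookup that an operating system routine could consult before changing a PTE or control register. The following is a minimal sketch under that assumption; the dictionary name, the action strings, and the required_steps helper are hypothetical, and the footnoted entries are recorded with their conditions as comments rather than evaluated.

```python
# (flush data cache, invalidate caches via ITI) for each action in Table 3.4.
TABLE_3_4 = {
    "setting A": (False, False),
    "setting P": (False, False),
    "clearing P": (False, True),
    "setting D": (False, False),
    "changing protection (U, W)": (False, True),
    "setting CD or WT": (True, True),
    "changing PFA in a used PTE": (False, True),   # "used": P was set at some past time
    "changing dirbase DTB": (False, True),
    "changing dirbase ATE": (False, True),
    "changing epsr WP": (False, False),
    "setting ccr DO and CO": (True, True),         # only if CCU-page data could be cached
    "modifying code": (False, True),               # no flush assumes WT or noncacheable pages
}


def required_steps(action):
    """Return the maintenance steps Table 3.4 lists for an action, in order."""
    flush, invalidate = TABLE_3_4[action]
    steps = []
    if flush:
        steps.append("flush data cache")
    if invalidate:
        steps.append("store to dirbase with ITI set")
    return steps
```

The flush step is listed before the invalidation, matching the order required in section 3.3.2.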



4.0 HARDWARE INTERFACE 

In the following description of the hardware interface,
the # symbol at the end of a signal name indicates 
that the active or asserted state occurs when the 
signal is at a low voltage. When no # is present after 
the signal name, the signal is asserted when at the 
high voltage level. 



4.1 Pins Overview 

Figure 4.1 identifies functional groupings of the pins. 
Table 4.1 lists every pin by its identifier, gives a brief 
description of its function, and lists some of its char- 
acteristics. All output pins are tristate, except BREQ, 
HIT#, HITM#, HLDA, LOCK#, and PCHK#. 

Table 4.1. Pin Summary

Pin                                     Active  When Floated /           Internal
ID         Name                         Level   Synch/Asynch             Resistor
---------  ---------------------------  ------  -----------------------  --------
Output Pins
ADS#       Address Status               LOW     HLDA, clock after BOFF#
BE7#-BE0#  Byte Enable                  LOW     HLDA, BOFF#
BREQ       Bus Request                  HIGH
CACHE#     Cache                        LOW     HLDA, BOFF#
CTYP       Cycle Type                   HIGH    HLDA, BOFF#
D/C#       Data/Code                            HLDA, BOFF#
HIT#       Snoop Hit Cache              LOW
HITM#      Snoop Hit Modified Line      LOW
HLDA       Hold Acknowledge             HIGH
KB0, KB1   Cache Block                  HIGH    HLDA, BOFF#
LEN        Length                       HIGH    HLDA, BOFF#
LOCK#      Address Lock                 LOW
M/IO#      Memory/IO                            HLDA, BOFF#
NENE#      Next Near                    LOW     HLDA, BOFF#
PCD        Page Cache Disable           HIGH    HLDA, BOFF#
PCHK#      Parity Check                 LOW
PCYC       Page Cycle                   HIGH    HLDA, BOFF#
PWT        Page Write-Through           HIGH    HLDA, BOFF#
TDO        Test Output                          Nonscan Mode
W/R#       Write/Read                           HLDA, BOFF#

Input/Output Pins
A31-A3     Address                      HIGH    AHOLD, HLDA, BOFF#
D63-D0     Data                         HIGH    HLDA, BOFF#
DP7-DP0    Data Parity                  HIGH    HLDA, BOFF#

Input Pins
AHOLD      Address Hold                 HIGH    Synch
BERR       Bus Error                    HIGH    Synch
BOFF#      Back-Off                     LOW     Synch
RSRVD#     Intel Reserved
BRDY#      Burst Ready                  LOW     Synch
BYPASS#    Intel Reserved               LOW
CLK        Clock
RESET      Reset                        HIGH    Asynch
EADS#      External Address Status      LOW     Synch
EWBE#      External Write Buffer Empty  LOW     Synch
FLINE#     Flush Line                   LOW     Synch
HOLD       Bus Hold                     HIGH    Synch
INT/CS8    Interrupt/Code-Size 8        HIGH    Asynch
INV        Invalidate                   HIGH    Synch
KEN#       Cache Enable                 LOW     Synch
NA#        Next Address                 LOW     Synch
PEN#       Parity Enable                LOW     Synch
TCK        Test Clock
TDI        Test Data Input                      Synch                    Pull-up
TMS        Test Mode Select                     Synch                    Pull-up
TRST#      Test Reset                   LOW     Asynch                   Pull-up
WB/WT#     Write-Back/Write-Through             Synch
SPARE      Intel Reserved

NOTE:
For output and input/output pins the fourth column is "When Floated";
for input pins it is "Synch/Asynch".

The pins D/C#, W/R#, and M/IO# define bus cycle
types. They are summarized in Table 4.2. For
data transfers to or from memory, two additional
pins, CTYP and PCYC, provide further information
regarding the type of transfer, as shown in Table 4.3.
Table 4.4 shows how the LEN and CACHE# pins
determine cycle length.



Table 4.2. ADS# Initiated Bus Cycle Definitions

M/IO#  D/C#  W/R#  Bus Cycle Initiated
-----  ----  ----  ---------------------
0      0     0     Interrupt Acknowledge
0      0     1     Special Cycle
0      1     0     I/O Read
0      1     1     I/O Write
1      0     0     Code Read
1      0     1     Reserved
1      1     0     Memory Read
1      1     1     Memory Write
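
For bus-monitoring logic or a simulator, Table 4.2 can be held as a direct lookup. The sketch below assumes the three pins are presented as voltage levels (1 = high, 0 = low) sampled in the clock in which ADS# is asserted; the names BUS_CYCLES and decode_bus_cycle are hypothetical.

```python
# Keys are pin levels (M/IO#, D/C#, W/R#); 1 = high, 0 = low.
BUS_CYCLES = {
    (0, 0, 0): "Interrupt Acknowledge",
    (0, 0, 1): "Special Cycle",
    (0, 1, 0): "I/O Read",
    (0, 1, 1): "I/O Write",
    (1, 0, 0): "Code Read",
    (1, 0, 1): "Reserved",
    (1, 1, 0): "Memory Read",
    (1, 1, 1): "Memory Write",
}


def decode_bus_cycle(m_io, d_c, w_r):
    """Return the Table 4.2 cycle name for the three cycle-definition pins."""
    return BUS_CYCLES[(m_io, d_c, w_r)]
```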



Table 4.3. Memory Data Transfer Cycle Types

PCYC  CTYP  W/R#  Data Transfer Type
----  ----  ----  ---------------------------------
0     0     0     Normal read
0     1     0     Pipelined load (pfld instruction)
1     0     0     Page directory read
1     1     0     Page table read
0     0     1     Write-through (S-state hit)
0     1     1     Store miss or write-back
1     0     1     Page directory update
1     1     1     Page table update

NOTE:
PCYC and CTYP are defined only for memory data transfer
cycles (D/C# = 1, M/IO# = 1).
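
The qualification in the note can be folded into a decoder for Table 4.3. This is a sketch only; the function and table names are hypothetical, the arguments are pin levels (1 = high, 0 = low), and returning None for cycles where PCYC and CTYP are undefined is a choice made here, not something the table specifies.

```python
# Keys are pin levels (PCYC, CTYP, W/R#); valid only for memory data cycles.
TRANSFER_TYPES = {
    (0, 0, 0): "Normal read",
    (0, 1, 0): "Pipelined load (pfld instruction)",
    (1, 0, 0): "Page directory read",
    (1, 1, 0): "Page table read",
    (0, 0, 1): "Write-through (S-state hit)",
    (0, 1, 1): "Store miss or write-back",
    (1, 0, 1): "Page directory update",
    (1, 1, 1): "Page table update",
}


def decode_transfer_type(m_io, d_c, pcyc, ctyp, w_r):
    """Return the Table 4.3 transfer type, or None if PCYC/CTYP are undefined."""
    if (m_io, d_c) != (1, 1):
        # Not a memory data transfer: PCYC and CTYP carry no defined meaning.
        return None
    return TRANSFER_TYPES[(pcyc, ctyp, w_r)]
```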









[Figure 240874-27: block diagram of the i860 XP pins arranged in
functional groups: data, address, parity, cycle control, cache control,
cache consistency, bus arbitration, interrupt, cycle definition, and
boundary scan.]

Figure 4.1. Signal Grouping



Table 4.4. Cycle Length Definition

W/R#  LEN  CACHE#  KEN#  Cycle Description                      Burst Length
----  ---  ------  ----  -------------------------------------  ------------
0     0    1       -     Noncacheable** 64-bit (or less) read   1
0     0    -       1     Noncacheable 64-bit (or less) read     1
1     0    1       -     64-bit (or less) write                 1
-     0    1       -     I/O and Special Cycles                 1
0     1    1       -     Noncacheable 128-bit read (p)fld.q     2
0     1    -       1     Noncacheable 128-bit read (p)fld.q     2
1     1    1       -     128-bit write fst.q                    2
0     -    0       0     Cache line fill                        4
1     -    0       -     Cache write-back                       4

NOTE:
** Includes CS8-mode code fetches, which may be cached by the processor.
-  Indicates "don't care" values.
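
Table 4.4 can also be read as a small decision procedure. The sketch below is one way to write it, assuming the arguments are pin voltage levels (0 = low, 1 = high), so CACHE# and KEN# are asserted when 0; the function name burst_length is hypothetical. I/O and special cycles drive LEN low and CACHE# high, so they fall under the single-transfer case without separate handling.

```python
def burst_length(w_r, len_pin, cache_n, ken_n):
    """Return the number of 64-bit transfers in a cycle, following Table 4.4.

    Arguments are pin levels: w_r (1 = write), len_pin (1 = active),
    cache_n and ken_n (0 = asserted, because both pins are active low).
    """
    if w_r == 0:
        # Read: a line fill needs both CACHE# (output) and KEN# (input) asserted.
        if cache_n == 0 and ken_n == 0:
            return 4
    else:
        # Write: only a cache line write-back asserts CACHE#.
        if cache_n == 0:
            return 4
    # Otherwise LEN alone selects one or two transfers.
    return 2 if len_pin == 1 else 1
```

For example, a pfld.q drives LEN high with CACHE# inactive, so burst_length(0, 1, 1, 1) gives a two-transfer burst whatever KEN# returns.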

4.2 Signal Description

In this section descriptions of all pins are presented
in alphabetical order.



4.2.1 A31-A3 (ADDRESS PINS) 

The 29-bit address bus (A31-A3) identifies addresses
to a 64-bit location. Separate byte-enable signals
(BE7#-BE0#) identify which bytes should be accessed
within the 64-bit location.
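
The split between the address bus and the byte enables can be sketched as follows. The function name is hypothetical, and the mapping of byte offset n to BEn# assumes the little-endian lane assignment implied by section 4.2.4 (BE0# covers D7-D0, BE7# covers D63-D56); accesses wider than 64 bits are outside the scope of this sketch.

```python
def bus_address_and_byte_enables(byte_addr, size):
    """Return (A31-A3 value, BE7#-BE0# levels) for one data access.

    size is in bytes (1, 2, 4, or 8) and the access must fit inside one
    64-bit location. In the returned 8-bit mask, bit n is the level of
    BEn#: 0 means asserted, because the byte enables are active low.
    """
    offset = byte_addr & 0x7
    if size not in (1, 2, 4, 8) or offset + size > 8:
        raise ValueError("access must fit in one 64-bit location")
    a31_a3 = (byte_addr & 0xFFFFFFFF) >> 3     # the low three bits are not driven
    selected = ((1 << size) - 1) << offset     # 1 bits mark the bytes accessed
    be_levels = ~selected & 0xFF               # invert: the pins are active low
    return a31_a3, be_levels
```

A 4-byte access at byte address 0x1004 therefore selects location 0x200 and asserts BE7#-BE4#, leaving BE3#-BE0# high.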

The address lines are bidirectional. The i860 XP mi- 
croprocessor drives the address lines unless it is in a 
hold state. The system drives address lines A31-A5
to perform cache line inquiries (refer to the EADS# 
signal description). 



4.2.2 ADS# (ADDRESS STATUS) 

The i860 XP microprocessor asserts ADS# to identify
the first clock period of each bus cycle, the clock
period during which new values become valid on the
address bus and cycle-definition pins. This signal is
held active for one clock.

If BOFF# is asserted, the processor floats ADS# 
two clocks after sampling BOFF# (and not, like all 
other pins, on the next clock). This is to ensure that 
ADS# is deasserted before it floats, and therefore is 
never left floating active. 

ADS# can be asserted while AHOLD is active to 
initiate a cache write-back cycle. 



4.2.3 AHOLD (ADDRESS HOLD) 

The external system asserts AHOLD to perform a 
cache inquiry. In response to assertion of AHOLD, 
the i860 XP microprocessor immediately (in the next 
clock) stops driving the address bus (A31-A3 lines).
The other buses remain active, and data can be 
transferred for previously issued read or write bus 
cycles during address hold. AHOLD is recognized 
even during RESET and LOCK#. The earliest that 
AHOLD can be deasserted is the clock after EADS# 
is asserted to start the inquiry. 

If HITM# has activated due to an inquiry, the 
i860 XP microprocessor asserts ADS# while 
AHOLD is active to start the write-back of the modi- 
fied line that was the target of the inquiry. 



4.2.4 BE7#-BE0# (BYTE ENABLES) 

The byte-enable pins are driven with the address.
BE7# applies to D63-D56, BE0# applies to D7-D0.

In write cycles (noncacheable writes as well as
cache line write-backs), the BEn# signals determine
which bytes must be written into external memory
for the current cycle.

In read cycles, the BEn# values indicate which byte
the load instruction has requested. In all noncacheable
read cycles (CACHE# or KEN# deasserted),
the byte enables match the length and address of
the requested data. Cacheable read cycles (KEN#
asserted), however, result in four 64-bit memory
transfers to fill an entire 32-byte cache line. The
BEn# pins activated are those that represent the
operand of the load instruction that caused the line
fill, and these same BEn# pins remain activated for
as long as A31-A5. All 64 bits must be returned for
each cacheable cycle without regard for the BEn#
signals.

While in CS8 mode, BE2#-BE0# serve as (active- 
high) lower-order address bits for instruction fetches 
(from the ROM). Data fetches and stores are not 
affected by CS8 mode, and BE2#-BE0# retain 
their normal byte-enable function for data. 

4.2.5 BERR (BUS ERROR) 

This is a nonmaskable interrupt input, which sup- 
ports bus error handling or other urgent circum- 
stances. BERR is not masked by the IM bit of the 
psr nor by lock cycles. When BERR is activated, the 
i860 XP microprocessor vectors to the trap handler 
and sets the bus error flag (BEF) in the epsr. BERR 
causes the physical address of the current bus cycle 
to be latched into the BEAR control register; thus, if 
asserted in the clock of BRDY# or the clock after
BRDY#, it causes the bus address to be latched for 
software to examine. BERR is rising-edge sensitive. 
Once the trap has occurred, further BEF traps can- 
not occur until software has cleared BEF and read 
BEAR. 

BERR does not terminate outstanding bus cycles. 
Therefore, the system must still activate BRDY# a 
sufficient number of times or activate BOFF# for
those cycles. Even though activating BOFF# tem- 
porarily halts the erring cycles, the i860 XP micro- 
processor will retry them when BOFF# is deassert- 
ed, in spite of BERR. 

Timing of BERR is not influenced by late back-off 
mode. 



4.2.6 BOFF# (BACK-OFF) 

The system can assert this signal to abort all out- 
standing bus cycles that have not yet completed. In 
response to BOFF#, the i860 XP microprocessor 
immediately (in the next clock) floats its bus, except
for ADS#, which is floated one clock later. The
processor floats all the same pins normally floated
during bus hold; however, unlike a bus hold, HLDA is
not asserted. (HLDA is asserted only in response to
HOLD; no acknowledgment is required for BOFF#.)
Any data and BRDY# returned to the processor 
while BOFF# is asserted are ignored. The proces- 
sor remains in bus hold until BOFF# is deasserted, 
at which time it restarts the bus cycles by driving the 
address and cycle definition pins and asserting 
ADS#. When BOFF# deactivates, ADS# may be 
asserted the following clock. Thus a BOFF# dura- 
tion of one clock results in not floating ADS# at all. 
BOFF# cannot be used to force the pins to float 
during RESET; use HOLD for that purpose. 

4.2.7 BRDY# (BURST READY) 

The input BRDY# indicates either that the external 
system has driven valid data on the data pins in re- 
sponse to a read request or that the external system 
has latched the data in response to a write request. 
The CPU ignores this signal when no bus requests
are outstanding. During a bus cycle, BRDY# is sampled
at each clock, starting with the clock after assertion
of ADS# and continuing until all data for the
cycle has been transferred. When BRDY# is sampled
active in a read cycle, the data present on the
pins is sampled.

4.2.8 BREQ (BUS REQUEST) 

BREQ allows the i860 XP microprocessor to share 
the local bus with other bus masters. An external 
bus arbiter can use BREQ to implement an "on de- 
mand only" policy for granting the bus to the i860 XP 
microprocessor. The i860 XP microprocessor asserts
BREQ the clock after it realizes an internal request
for the bus. The system should sample this pin
only when the i860 XP microprocessor is not in control
of the bus (that is, when HLDA, BOFF#, or
AHOLD is active). BREQ is undefined when the
i860 XP microprocessor is driving the bus. BREQ 
may be deasserted between assertions of ADS#, 
but this does not imply that the CPU does not need 
the bus. 



4.2.9 BYPASS# (BYPASS)

This pin is reserved by Intel Corporation and should
be tied HIGH to VCC through a resistor. When LOW,
the phase-locked loop that generates the internal 
clock is unused. In this case, the internal clock has 
more skew relative to the external CLK, and the A.C. 
timing parameters are not guaranteed. 



4.2.10 CACHE# (CACHEABILITY) 

This output signal indicates internal cacheability of a 
bus request. Its timing follows that of the address 
bus. 

The i860 XP microprocessor asserts CACHE# for
cacheable reads and code fetches to announce its
intention to cache the data. If CACHE# is asserted
on a read cycle and if the KEN# input is active, the
cycle is a burst line fill. If CACHE# is inactive in a
read cycle, the i860 XP microprocessor does not
cache the returned data, regardless of the KEN#
pin. CACHE# is also asserted for cache line
write-backs.

CACHE# is inactive for noncacheable reads (for
example, pfld, ldio, ldint), TLB replacements, and
store misses.

Table 4.4 shows how cacheability determines the
number of data transfers in a cycle.

Note that the CACHE# output is always inactive for
CS8 (Code-Size 8 bits) mode instruction fetches so
that the instructions are fetched with single-transfer
cycles. However, the code fetched may then be
placed in the instruction cache, unless KEN# was
inactive.



4.2.11 CLK (CLOCK) 

The CLK input determines execution rate and timing 
of the i860 XP microprocessor. External timing pa- 
rameters are specified relative to the rising edge of 
this signal. The i860 XP microprocessor can utilize a
clock rate of 50 MHz. The internal operating frequency
is the same as the external clock. This signal
requires TTL levels.

4.2.12 CTYP (CYCLE TYPE)

CTYP is one of the bus cycle definition signals. Ta- 
bles 4.2 and 4.3 show the types of bus cycle gener- 
ated. CTYP is defined only for data write and read 
requests. The value of this pin changes only when 
ADS# is asserted. 



4.2.13 D/C# (DATA/CODE) 

D/C# specifies whether the current request is for 
data or instructions. The data/code line is one of the 
bus cycle definition pins. Tables 4.2 and 4.3 show 
the types of bus cycle generated. The value of this 
pin changes only when ADS# is asserted. 



4.2.14 D63-D0 (DATA PINS) 

The bus interface has 64 bidirectional data pins 
(D63-D0) to transfer data in eight- to 64-bit quanti- 
ties. Pins D7-D0 transfer the least significant byte; 
pins D63-D56 transfer the most significant byte. In 
read cycles, all 64 bits of the data bus are latched, 
even in CS8-mode instruction fetches when only the 
low-order eight bits are used. In write cycles, the 
i860 XP microprocessor does not drive D63-D0 in
the clock of ADS#, but in the following clock.

4.2.15 DP7-DP0 (DATA PARITY) 

There is one parity signal for each byte of the data 
bus. They are driven by the i860 XP microprocessor 
with even parity information on writes with the same 
timing as write data. Likewise, if parity checking is 
enabled by PEN#, the system must drive even pari- 
ty information on these pins with the same timing as 
read information to ensure that the correct parity 
check status is indicated by the i860 XP microproc- 
essor. "Even parity" means that the total number of 
set bits in a byte, including the parity bit, is even. 
Refer also to the PCHK# signal. 
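
The even-parity rule can be checked with a short calculation. The sketch below is illustrative: the function name is hypothetical, and it assumes parity signal DPn covers the byte on D(8n+7)-D(8n), which the text implies ("one parity signal for each byte") but does not spell out.

```python
def data_parity_bits(data64):
    """Return DP7-DP0 as an 8-bit value for a 64-bit data word.

    Bit n of the result is the even-parity bit for byte n: it is set
    when the byte has an odd number of set bits, so that the byte plus
    its parity bit always contains an even number of set bits.
    """
    dp = 0
    for n in range(8):
        byte = (data64 >> (8 * n)) & 0xFF
        ones = bin(byte).count("1")
        if ones % 2 == 1:
            dp |= 1 << n
    return dp
```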

4.2.16 EADS# (EXTERNAL ADDRESS STATUS) 

This signal indicates that a valid external address
has been driven onto address pins A31-A5 of the
i860 XP microprocessor to be used for a cache inquiry.
This signal is recognized while the processor
is in hold (HLDA is driven active), while forced off the
bus with BOFF# input, or while AHOLD is asserted.
The i860 XP microprocessor ignores EADS# at all
other times. EADS# is not recognized if HITM# is
active, nor during the clock after ADS#, nor during
the clock after a valid assertion of EADS#. Table
4.5 shows when EADS# is first sampled. It is then
sampled in every clock as long as the hold remains
active and HITM# remains inactive.



Table 4.5. EADS# Sample Time

Trigger  EADS# First Sampled
-------  ---------------------------------
AHOLD    Second clock after AHOLD asserted
HOLD     First clock after HLDA asserted
BOFF#    Second clock after BOFF# asserted



INV and FLINE# are sampled in the same clock pe- 
riod that EADS# is validly asserted. HIT# and 
HITM# may be asserted as the results of a cache 
inquiry. 

4.2.17 EWBE# (EXTERNAL WRITE BUFFER 
EMPTY) 

At RESET, the value on EWBE# determines the or- 
dering mode. The processor enters strong ordering 
mode if EWBE# is sampled active for at least the 
last three clocks before RESET deactivates; other- 
wise, it enters weak ordering mode. 

In weak ordering mode, the value of EWBE# after 
reset does not affect processor operation. 

In strong ordering mode, the external system asserts 
EWBE# as long as all external write buffers are 
empty. If an external write buffer is not empty 
(EWBE# deasserted) or the internal write buffer is 
not empty, the processor delays data cache updates 
so as to keep the external order of writes the same 
as the programmed order. 

In systems that do not have external write buffers,
EWBE# can be tied to VSS, if strong ordering is desired,
or to VCC, if weak ordering is acceptable. Refer
to sections 5.3.3 and 5.3.4 for more explanation
and for other ways to control write ordering.

4.2.18 FLINE# (FLUSH LINE) 

The system asserts FLINE# to request that the
i860 XP microprocessor write back a modified cache
line before other outstanding bus cycles are completed,
if the line is hit by an external inquiry. If this
pin is active in the same clock that EADS# is asserted,
the write-back cycle is initiated, and the i860 XP
microprocessor expects BRDY#s for the write-back
before outstanding cycles (if any) are returned. If
data transfer for another cycle is currently in progress
when FLINE# is asserted (i.e., first BRDY# returned
before HITM# asserted), the i860 XP microprocessor
waits until the data transfers for that burst
have completed, and only then does it assert the
ADS# for the write-back. If the first BRDY# has not
yet occurred for an outstanding cycle, NA# must be
activated to trigger ADS# for the write-back.

At RESET, the value on FLINE# determines configuration.
The processor enters one-clock late back-off
mode if FLINE# is sampled active for at least the
last three clocks before RESET deactivates.



4.2.19 HIT# (CACHE INQUIRY HIT) 

This pin is one output of inquiry cycles. If an inquiry 
cycle hits a valid line in the caches of the i860 XP 
microprocessor (either data or instruction), HIT# is 
asserted two clocks after EADS# is activated. If the 
inquiry cycle misses the caches, this pin is negated 
two clocks after EADS# activation. 

This pin changes its value only as a result of EADS# 
activation during AHOLD, HOLD, or BOFF# and re- 
tains its value until two clocks after the next valid 
activation of EADS#. 

HIT# can be used to control the WB/WT# pin of 
other processors in a multiprocessor system. Activa- 
tion of HIT# indicates that the inquiring processors 
should cache the line as S-state, not E-state. 



4.2.20 HITM# (HIT MODIFIED LINE) 

This pin is an output of inquiry cycles. When an in- 
quiry hits a modified line in the internal data cache, 
the i860 XP microprocessor asserts HITM# two 
clocks after EADS# is activated. (Refer also to the 
EADS# signal.) The HITM# signal stays active until 
the last BRDY# for the corresponding write-back 
cycle. At all other times, HITM# is inactive. HIT# is 
also asserted when HITM# is asserted (except for 
the special case of an inquiry after the ADS# of a 
write-back). 

4.2.21 HLDA (BUS HOLD ACKNOWLEDGE) 

The i860 XP microprocessor activates HLDA in re- 
sponse to a hold request presented on the HOLD 
pin. Assertion of HLDA indicates that the i860 XP 
microprocessor has given the bus to another local 
bus master. It is driven active in the same clock that 
the i860 XP microprocessor floats its bus. All output 
pins are floated except LOCK#, BREQ, HLDA, 
PCHK#, HIT#, and HITM#. 

The time required to acknowledge a hold request is 
one clock plus the number of clocks needed to finish 
any outstanding bus cycles (maximum of four out- 
standing cycles of four burst transfers each for total 
of 16 transfers). If this hold latency is too long for a 
given application, BOFF# can be used instead. 

When leaving a bus hold, the i860 XP microproces- 
sor deactivates HLDA and, in the same clock period, 
initiates a pending bus cycle, if any. 



4.2.22 HOLD (BUS HOLD) 

This pin, along with the output signal HLDA, is used 
for local bus arbitration. At some time after the 
HOLD signal is asserted, the i860 XP microproces- 
sor releases control of the local bus and puts most 
bus interface outputs in floating state, then asserts 
HLDA — all during the same clock period. It main- 
tains this state until HOLD is deasserted. Instruction 
execution stops only if required instructions or data 
cannot be read from the on-chip instruction and data 
caches. The i860 XP microprocessor ignores HOLD 
until all outstanding bus cycles are complete (until 
the last BRDY#). The i860 XP microprocessor rec- 
ognizes HOLD even during RESET and LOCK#. 
HOLD cannot be used when the 82495XP cache 
controller is attached. 



4.2.23 INV (INVALIDATE)

The external system asserts this signal to invalidate 
the cache-line state in the case of an inquiry cycle 
hit. It is sampled together with A31-A5 in the clock
in which EADS# is active.



4.2.24 INT/CS8 (INTERRUPT/CODE-SIZE 
EIGHT BITS) 

This input, like the BERR input, allows interruption of
the current instruction stream. The processor samples
INT at instruction boundaries. If interrupts are
enabled (IM set in psr) when INT is sampled active,
the i860 XP microprocessor fetches the next instruction
from virtual address 0xFFFFFF00. INT is level
triggered. To assure that an interrupt is recognized,
INT should remain asserted until the software acknowledges
the interrupt (by executing an interrupt-acknowledge
cycle, for example). The interrupt may
be ignored by the processor if the INT signal does
not remain active.

Interrupt latency (the maximum time between asser- 
tion of INT and execution of the first instruction of 
the trap handler) depends both on the internal con- 
text and on the external system. After INT is assert- 
ed, the i860 XP microprocessor finishes all instruc- 
tions currently being executed, including any out- 
standing bus cycles, before starting the trap handler. 
The following instruction sequence is an example of 
the worst case: 

    pfld.q
    pfld.q
    ld.l
    br
    ld.l
    st.l

If INT is asserted during the execution stage of the
last ld.l instruction, the execution of the trap handler
may have to wait for:

• Two 2-transfer bursts (the pfld instructions)

• Two data cache line fills (misses by the ld.l
instructions)

• Two data cache line write-backs (eliminating
modified lines to open space for the fills)

• Two instruction cache line fills (the target of the
br and the first instruction of the trap handler)

• Three TLB miss sequences of up to six nonpipelined
accesses each (the br, the last ld.l, and the
trap handler)
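
To get a feel for the size of this worst case, the bus activity in the list above can be totaled. The tally below is only a rough sketch: it assumes each line fill or write-back is a four-transfer burst (as described for KEN# and Table 4.4) and counts each nonpipelined TLB access as one transfer. It gives a transfer count, not a clock count, because wait states, inquiry cycles, and the internal freeze conditions depend on the system.

```python
# (description, number of occurrences, 64-bit transfers per occurrence)
WORST_CASE_BUS_ACTIVITY = [
    ("pfld.q two-transfer bursts", 2, 2),
    ("data cache line fills", 2, 4),
    ("data cache line write-backs", 2, 4),
    ("instruction cache line fills", 2, 4),
    ("TLB miss sequences, up to six accesses each", 3, 6),
]

# Sum over all entries: 2*2 + 2*4 + 2*4 + 2*4 + 3*6 transfers.
total_transfers = sum(count * each for _, count, each in WORST_CASE_BUS_ACTIVITY)
```

Under these assumptions the trap handler may wait for up to 46 bus transfers before its first instruction executes.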

The time to finish the above bus activities can be 
extended by inquiry cycles and associated write- 
backs initiated by an external cache or bus control- 
ler. 

Besides the bus-related delays, the i860 XP micro- 
processor has internal freeze conditions that can de- 
lay interrupt response by up to 10 additional clocks. 

During a locked sequence, the INT pin is ignored, 
and the INT bit of epsr reflects the value on the INT 
pin. To limit the time that INT is ignored, the lock 
instruction can assert LOCK# for only 30-33 in- 
structions before trapping. 

This input is asynchronous, but appropriate setup 
and hold times must be met to ensure recognition on
any specific clock. 

If INT is asserted for at least the last three clock 
periods before the falling edge of RESET, the 
i860 XP microprocessor enters eight-bit code-size 
(CS8) mode. 

4.2.25 KB0, KB1 (CACHE BLOCK)

For reads, these output signals define which cache 
block (line) is going to receive the data. For write- 
backs, these lines specify which block is being 
flushed. They are driven together with cycle defini- 
tion for cacheable data reads, TLB replacement, 
code fetch cycles, and write-backs. External hard- 
ware can use these signals to observe changes to 
cache blocks. 



4.2.26 KEN# (CACHE ENABLE) 

The i860 XP microprocessor samples KEN# to determine
whether the data being read for the current
cache-miss cycle is to be cached. When the i860 XP
microprocessor generates a read cycle that can be
cached (CACHE# output active) and KEN# is active,
the cycle is transformed into a burst line fill. By
activating KEN#, the memory system commits to a
four-transfer burst. The entire 64 bits of the data bus
are used for the read, regardless of the state of the
byte-enable pins.

If KEN# is sampled inactive, code fetches are not 
transferred in bursts, but 128-bit data items may still 
be transferred with a burst length of two. 

KEN# is sampled together with NA# or BRDY#, 
whichever comes first. It is sampled only with the 
first BRDY# of a burst; its value at any other time 
has no effect. 



4.2.27 LEN (DATA LENGTH) 

The LEN output pin specifies the number of burst
transfers for each cycle. This pin and the CACHE#
output pin are used by the system to determine the
burst length for each cycle (refer to Table 4.4). The
i860 XP microprocessor can generate 1-, 2-, or
4-transfer bursts for reads and writes.

LEN is inactive if the internal request is for 64 bits or 
less. If LEN is active, the internal request is for 128 
bits or more, and the cycle should be returned as a 
two- or four-transfer burst. LEN is always active for 
128-bit data accesses. LEN is always inactive for 
code accesses. 

A cacheable read (CACHE# active) can be automatically
converted to a four-transfer burst regardless
of LEN by assertion of KEN#.

Table 4.4 summarizes different cycle lengths as they
are calculated from the LEN and CACHE# signals.
LEN has the same timing as the address.
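
The read-cycle rules stated in Sections 4.2.26 and 4.2.27 can be
sketched as a small function. This is only an illustration under
stated assumptions: Table 4.4 itself is not reproduced here, write
cycles are not modeled, and the function name and its boolean
"active" arguments are hypothetical (they denote the asserted state
of each pin, not its voltage level).

```python
def read_burst_length(len_active: bool, cache_active: bool,
                      ken_active: bool) -> int:
    """Number of BRDY# transfers expected for a read cycle.

    Assumption: only the read-cycle rules quoted in the text are
    modeled; Table 4.4 remains the authoritative definition.
    """
    if cache_active and ken_active:
        # Cacheable read accepted by the system: four-transfer line
        # fill, regardless of LEN.
        return 4
    if len_active:
        # 128-bit data item that is not converted to a line fill.
        return 2
    # Request for 64 bits or less.
    return 1
```

For example, a cacheable 64-bit read becomes a four-transfer burst
only when the system answers with KEN# active; with KEN# inactive it
remains a single transfer.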



4.2.28 LOCK# (ADDRESS LOCK)

This signal is used to provide atomic (indivisible)
read-modify-write sequences in multiprocessor systems.
The address to be locked is the one being
driven on A31-A3 when LOCK# is activated. A
multiprocessor bus arbiter must permit only one processor
a locked read, locked write, or unlocked write to
that address and must maintain the lock of that location
across cycle boundaries until LOCK# deactivates.
The simplest arbitration hardware can just
lock the entire bus against all other accesses during
LOCK# assertion; however, software must never
assume that this implementation is being used.
The i860 XP microprocessor coordinates the external
LOCK# signal with the lock and unlock
instructions. Programmers do not have to be concerned
about the fact that bus activity is not always
synchronous with instruction execution. LOCK# is
asserted with ADS# for the address operand of the
first load or store instruction executed after the lock
instruction.

After an unlock instruction, LOCK# is deasserted 
with the next load or store. The i860 XP microproc- 
essor deactivates LOCK# one clock after ADS# for 
the last locked bus cycle. Unlike the i860 XR micro- 
processor, the i860 XP microprocessor does not 
deassert LOCK# immediately when a trap occurs. 
Instead, the trap handler must execute a load or 
store instruction to deassert LOCK#. (The handler 
does not have to execute an unlock instruction, 
however. The unlocking function is performed by the 
processor's trap logic.) 

The i860 XP microprocessor also asserts LOCK# 
during TLB miss processing for updates of the ac- 
cessed bit in page-directory and page-table entries. 
The maximum time that LOCK# can be asserted in 
this case is the time required to perform a nonpipe- 
lined, four-byte, read-modify-write sequence. 

Between locked sequences, at least one cycle of no 
LOCK# is guaranteed by the behavior of the unlock 
instruction. 

Between lock and unlock instructions, the INT pin is 
ignored. 

Instruction fetches do not alter the LOCK# signal. 



4.2.29 M/IO# (MEMORY-I/O) 

M/IO# specifies whether the current cycle is for the 
memory address space or for the I/O address 
space. M/IO# is one of the bus cycle definition pins. 
Tables 4.2 and 4.3 show the types of bus cycle gen- 
erated. The value of this pin changes only when 
ADS# is asserted. 



4.2.30 NA# (NEXT ADDRESS REQUEST) 

NA# makes address pipelining possible. The sys- 
tem asserts NA# for at least one clock to indicate 
that it is ready to accept the next address from the 
i860 XP microprocessor. (If the system does not im- 
plement pipelining, NA# must not be activated.) The 
i860 XP microprocessor samples NA# every clock, 
starting one clock after the activation of ADS#. If 
the i860 XP microprocessor has a new cycle pend- 
ing internally when NA# is activated, it initiates that 
cycle in the clock after NA# is asserted. Up to three 
bus cycles can be outstanding simultaneously. 



NA# is latched internally; the i860 XP microproces- 
sor remembers that NA# was asserted until it has 
an internal request to send to the bus; so, assertion 
of NA# for a single clock can trigger an ADS# sev- 
eral clocks later. NA# is ignored in the clock of 
ADS#. 

KEN# and WB/WT# inputs for the current cycle 
are sampled with NA#, if NA# is asserted before 
the first BRDY# of the current cycle. 

NA# is also used in conjunction with FLINE# to 
invoke write-back of a modified line during outstand- 
ing bus cycles. 

4.2.31 NENE# (NEXT NEAR) 

The i860 XP microprocessor asserts NENE# when 
the current address is in the same DRAM page as 
the previous bus cycle. This signal allows higher- 
speed reads and writes in the case of consecutive 
accesses to static column or page-mode DRAMs. 
The i860 XP microprocessor determines the DRAM 
page size by inspecting the software-controlled DPS 
field in the dirbase register. The page size can 
range from 2^9 to 2^16 64-bit words, supporting DRAM
sizes from 256K x 1 to 4G x n. The value of this 
pin changes only when ADS# is asserted. NENE# 
is never asserted for the next bus cycle after the 
address bus has been floating (after AHOLD, 
BOFF#, or HLDA is deasserted). 
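
The same-page comparison behind NENE# can be sketched as follows.
This is a minimal model, not the processor's logic: the encoding of
the DPS field is not given in this section, so the page size is
passed directly as a hypothetical parameter page_words_log2 (9
through 16, the base-2 logarithm of the page size in 64-bit words),
and the function names are illustrative only.

```python
from typing import Optional


def same_dram_page(prev_addr: int, addr: int, page_words_log2: int) -> bool:
    """True if two byte addresses fall in the same DRAM page."""
    if not 9 <= page_words_log2 <= 16:
        raise ValueError("page size must be 2**9 .. 2**16 64-bit words")
    # A 64-bit word is 8 bytes, so the page spans 2**(n + 3) bytes.
    shift = page_words_log2 + 3
    return (prev_addr >> shift) == (addr >> shift)


def nene_asserted(prev_addr: Optional[int], addr: int,
                  page_words_log2: int,
                  bus_was_floating: bool = False) -> bool:
    """Model of NENE# for the cycle at addr (hypothetical helper)."""
    # NENE# is never asserted for the first cycle after the address
    # bus has been floating (AHOLD, BOFF#, or HLDA deasserted).
    if bus_was_floating or prev_addr is None:
        return False
    return same_dram_page(prev_addr, addr, page_words_log2)
```

With the smallest page size (2^9 words, or 4 Kbytes), addresses
0x0FF8 and 0x1000 lie in different pages, so the second access would
not be flagged as "next near".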

4.2.32 PCD (PAGE CACHE DISABLE) 

PCD provides a cacheability indication on a page-by-page
basis. This signal, together with PWT, reflects
an attribute bit in the page table entry for the current
cycle. When paging is enabled, PCD corresponds to
the CD bit (bit 4) of the page table entry. The i860 XP
microprocessor does not perform a cache fill to any
page for which CD of the page table entry is set.
When paging is disabled, or for any cycle that is not
paged (ldio, stio, ldint, scyc), the i860 XP
microprocessor drives PCD inactive.

During TLB miss processing, PCD is inactive while 
the address translation hardware is accessing the 
first level page directory. During accesses to the 
second-level page-table entry, PCD reflects the CD 
values taken from the first level page-table entry. 

The value of this pin changes only when ADS# is 
asserted. 



4.2.33 PCHK# (PARITY CHECK) 

This output shows the result of the parity check on
data pins in the previous clock of a read cycle. It is
asserted for one clock when incorrect parity has
been detected. It reflects the parity status for the
entire data bus.

PCHK# does not terminate outstanding bus cycles, 
so the system must still activate BRDY# a sufficient 
number of times or activate BOFF# for those cy- 
cles. PCHK# is always inactive after any code fetch 
in CS8 mode. 

4.2.34 PCYC (PAGE CYCLE) 

The page cycle line is active during memory read or 
write cycles to distinguish page-table accesses from 
other accesses. The types of bus cycle generated 
are indicated in Tables 4.2 and 4.3. The value of this 
pin changes only when ADS# is asserted.

4.2.35 PEN# (PARITY ENABLE) 

The i860 XP microprocessor samples this signal for
read cycles on the same clock edge at which
BRDY# is found asserted. If sampled active, the
i860 XP microprocessor feeds the parity check result
into the interrupt logic. If a parity error is encountered,
the i860 XP microprocessor vectors to the
trap handler. The BEAR register latches the offending
address, as described with the BERR signal.
This interrupt is not masked by the IM bit of the PSR,
nor is it masked during lock cycles.

The system should deassert PEN# any time the 
DP7-DP0 pins are known not to reflect the parity of 
the full eight-byte bus (for example, reads from I/O 
devices or ROMs that are not parity protected). 

The system should deassert PEN# during code 
fetches in CS8 mode. 

At RESET, the value of PEN# determines the out- 
put buffers configuration for ADS#, A21-A3, 
BE7#-BE0#, W/R#, HITM#. These pins are con- 
figured as normal (small output buffers) mode if 
PEN# is sampled active for at least the last three 
clocks before RESET deactivates. Otherwise, these 
pins are configured as high-current mode (large out- 
put buffers). 

4.2.36 PWT (PAGE WRITE-THROUGH) 

PWT provides a write-back/write-through indication
on a page-by-page basis. This signal, together with
PCD, reflects an attribute bit in the page table entry
for the current cycle. When paging is enabled, PWT
corresponds to the WT bit (bit 3), and write-back
caching is implemented for this page only if WT is
clear. When paging is disabled, or for any cycle that
is not paged (ldio, stio, ldint, scyc), the i860 XP
microprocessor drives PWT inactive.

During TLB miss processing, PWT is inactive while
the address translation hardware is accessing the
first level page directory. During accesses to the
second-level page-table entry, PWT reflects the WT
value taken from the first level page-table entry.

The value of this pin changes only when ADS# is 
asserted. 
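
The relationship between the page-table attribute bits and the PCD
and PWT pins for ordinary paged cycles can be sketched as below. The
bit positions (CD = bit 4, WT = bit 3) come from the text; the
function name, its arguments, and the use of booleans for "active"
are assumptions made for illustration, and the special behavior
during TLB miss processing is deliberately not modeled.

```python
CD_BIT = 1 << 4  # cache-disable attribute, bit 4 of the page table entry
WT_BIT = 1 << 3  # write-through attribute, bit 3 of the page table entry


def page_attribute_pins(pte: int, paging_enabled: bool,
                        paged_cycle: bool = True):
    """Return (pcd_active, pwt_active) for a bus cycle.

    paged_cycle is False for cycles that are never paged
    (ldio, stio, ldint, scyc).
    """
    if not paging_enabled or not paged_cycle:
        # Both pins are driven inactive when no translation applies.
        return (False, False)
    return (bool(pte & CD_BIT), bool(pte & WT_BIT))
```

A page table entry with only bit 3 set therefore yields PCD inactive
and PWT active, which selects write-through caching for that page.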

4.2.37 RESET (SYSTEM RESET) 

Asserting RESET for at least ten CLK periods caus- 
es initialization of the i860 XP microprocessor. On 
power up, RESET should remain active at least one 
millisecond after Vcc and CLK have reached their 
proper DC and AC specs. RESET is synchronous 
with CLK. 

After the RESET signal goes inactive the processor 
remains in the RESET state for three more clocks. 
Applications that use the HOLD signal to float the 
bus during RESET should keep HOLD active for 
three more clocks after the RESET signal is deacti- 
vated. 

4.2.38 RSRVD, SPARE 

The RSRVD input is reserved by Intel Corporation 
and must be tied HIGH to Vcc through a resistor 
(5 KΩ). The spare input should be left unconnected.

4.2.39 TCK (TEST CLOCK) 

This is the clock input for the TAP (test access port). 
If the TAP is to be used, this signal must be connect- 
ed to a clock synchronous to CLK. If the TAP is not 
used, TCK can be tied low. TCK does not need to be 
kept running when boundary scan is not active. 

The rising edge of TCK must be externally synchro- 
nized to CLK. The boundary scan latches retain their 
state when TCK is stopped at either logic zero or 
one. 

4.2.40 TDI (TEST DATA INPUT) 

TDI is the input for test instructions and data to the 
TAP. TDI is sampled on the rising edge of TCK. It is 
provided with an internal pull-up resistor, so that an 
open circuit at TDI produces a result equivalent to 
driving continuous HIGH signals. 

4.2.41 TDO (TEST DATA OUTPUT) 

This is the serial output of the TAP. The contents of 
TAP registers are shifted out through TDO on the 
falling edge of TCK. The data is moved from TDI to 
TDO without inversion, which allows easy serial cas- 
cading of different components for scanning. 

TDO is held in high-impedance state, except while 
scanning is in progress. This allows parallel connec- 
tion of these outputs for several components. 
4.2.42 TMS (TEST MODE SELECT)

This input is decoded by the TAP to select the oper- 
ation of the TAP. It is sampled at the rising edge of 
TCK. It is provided with an internal pull-up resistor to 
assure deterministic behavior for open-circuit failure 
at this pin. If boundary scan is not used, TMS can be 
tied high or left unconnected. 



4.2.43 TRST# (TEST RESET) 

This input resets the TAP. If the TAP is not used, 
TRST# should be tied LOW. To ensure determinist- 
ic behavior of the test logic, TMS should be held 
HIGH while TRST# changes from LOW to HIGH. 



4.2.44 Vcc (SYSTEM POWER) AND Vss (GROUND)

The i860 XP microprocessor has 54 pins for power 
and 56 for ground. All pins must be connected to the 
appropriate low-inductance power and ground sig- 
nals in the system. 

4.2.45 VccCLK (CLOCK POWER) 

This is the power supply for the internal CLK buffer. 
It should be connected to the same Vcc plane as
the other Vcc pins. 

4.2.46 WB/WT# (WRITE-BACK/WRITE- 
THROUGH) 

This input signal defines cache policy for the line
being accessed in the current bus cycle. The processor
samples WB/WT# for both reads and writes
on the same clock edge at which it finds NA# or the
first BRDY# asserted, whichever comes first. If this
signal is sampled low, the write-through policy is
applied to the cache line: if an internal write hits this
line, it causes a write-through cycle. If this signal is
sampled high, the write-back policy is applied: future
write hits to this line do not show up on the bus.

4.2.47 W/R# (WRITE/READ) 

This pin specifies whether a bus cycle is a read 
(LOW) or write (HIGH) cycle. Tables 4.2 and 4.3 
show the types of bus cycle generated. The value of 
this pin changes only when ADS# is asserted. 



5.0 BUS OPERATION 

The interaction among signals is illustrated by timing 
diagrams. Figure 5.1 shows the conventions used in 
the timing diagrams. 



5.1 Bus Cycles

A bus cycle begins when the i860 XP microprocessor
activates ADS# and ends when the system activates
the last of a predetermined number of BRDY#
signals. Table 4.4 shows how the i860 XP
microprocessor and the external system cooperate to
determine the number of BRDY# activations in each
cycle. The processor starts sampling BRDY# one
clock after assertion of ADS# and continues sampling
in every clock until the last BRDY# becomes
active.

The i860 XP microprocessor supports several differ- 
ent types of bus cycle. These are introduced in order 
of complexity: 

1. Single-transfer cycles

2. Multiple-transfer (burst) cycles 

3. Pipelined cycles 

4. Cache inquiry cycles 



NOTES:
1. HIGH (high voltage)
2. Don't care or undefined
3. LOW (low voltage)
4. High-impedance (floating)
5. Either HIGH or LOW

Figure 5.1. Timing Diagram Conventions

5.1.1 SINGLE-TRANSFER CYCLE

The simplest bus cycle is the single-transfer, non- 
cacheable, 64-bit cycle either with or without wait 
states. The shortest bus cycle is two clock periods 
long. Read and write cycles of this type are shown in 
Figure 5.2. 

A wait state is any clock in which the i860 XP
microprocessor samples BRDY# but the system does not
assert it. The system can add wait states to any cycle.
Figure 5.3 shows cycles with two wait states
added. Any number of wait states can be added to
i860 XP microprocessor bus cycles by maintaining
BRDY# inactive.



5.1.2 BURST CYCLES 

When a bus request requires more than a single 
data transfer (refer to Table 4.4), the i860 XP micro- 
processor requires that the memory system perform 
a burst data transfer. Burst cycles allow the maxi- 
mum bus transfer rate by eliminating unnecessary 
driving of the address bus. The addresses of the 
data items in burst cycles all fall within the same 32- 
byte aligned area (corresponding to an internal 
i860 XP microprocessor cache line). Given the ad- 
dress of the first transfer, external hardware can cal- 
culate the addresses of subsequent transfers. With 
these addresses eliminated from the bus, a new 
data item can be sampled into the i860 XP micro- 
processor every clock period. 

Figure 5.2. Fastest Single-Transfer Cycles

Figure 5.3. Single-Transfer Cycles with Wait States



The fastest possible burst cycle requires two clock
periods for the first data item: one clock for ADS#
and one clock for BRDY#; subsequent data items
are transferred every clock period. One such bus
cycle is shown in Figure 5.4. Note that, in this case,
the initial cycle generated by the i860 XP microprocessor
could be satisfied by a single data transfer, but
the system transforms it into a multiple-transfer
cache line fill by activating KEN# in the clock period
of the first BRDY#. KEN# has this effect only if the
CACHE# pin is active, which means the cycle is
internally cacheable in the i860 XP microprocessor.



Read data is sampled only in the clock period in 
which BRDY# is returned, which means that data 
need not be sent to the i860 XP microprocessor ev- 
ery clock period in the burst cycle. Figure 5.5 shows 
an example of a burst cycle in which two clock peri- 
ods are required for every burst item. 
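
The clock counts described above follow a simple rule for a
nonpipelined cycle: one clock for ADS#, then one clock per transfer
plus any wait states inserted before that transfer's BRDY#. The
sketch below illustrates that arithmetic; the function name and the
list-of-wait-states representation are hypothetical, and pipelined
cycles and bus turnover clocks are not modeled.

```python
def cycle_clocks(wait_states):
    """Clock periods from ADS# through the last BRDY#.

    wait_states holds one entry per data transfer: the number of
    clocks in which BRDY# is sampled inactive before that transfer.
    """
    if not wait_states:
        raise ValueError("a bus cycle has at least one transfer")
    if any(w < 0 for w in wait_states):
        raise ValueError("wait states cannot be negative")
    # One clock for ADS#, then (wait states + 1 BRDY# clock) per item.
    return 1 + sum(w + 1 for w in wait_states)
```

Under this model the shortest single-transfer cycle takes two
clocks, the fastest four-transfer burst takes five, and a burst that
needs two clock periods for every item takes nine.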

The burst length attributes LEN and CACHE# are
driven with the address. Figure 5.6 illustrates two
consecutive burst cycles with differing length attributes:
the first one is a noncacheable 128-bit read,
and the second one is a cache line fill initiated by a
cacheable 64-bit read.

Figure 5.4. Basic Burst Cycle

Figure 5.5. Slow Burst Cycle

Figure 5.6. Different Lengths of Burst Cycles



The timing of write bursts is similar to that of read
bursts. The i860 XP microprocessor does not put
data on D63-D0 for writes until the clock period
after ADS#.

When initiating any read, the i860 XP microproces- 
sor presents the address for the data item request- 
ed. When the cycle is converted into a cache fill, the 
first data item returned corresponds to the address 
sent out by the i860 XP microprocessor. The remain- 
ing items must be returned in the order shown in 
Table 5.1. This ordering is optimized for two-bank 
memories, but works equally well with noninter- 
leaved memories. 

In i860 XP microprocessor systems, memory must 
support the burst order as defined in Table 5.1 for 
reads. For writes, the burst addresses are always 
increasing, so writes with four transfers match the 
first line of the table. In CS8 (code-size 8 bits) mode, 
instructions are not fetched in bursts. 

Note that the i860 XP microprocessor drives only 
the first address of a burst cycle; the memory sys- 
tem is responsible for calculating subsequent ad- 
dresses as shown in the table. The addresses can 
be derived by complementing A3 after every trans- 
fer, and complementing A4 after two transfers. 
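
The address rule above (complement A3 after every transfer,
complement A4 after two transfers) is equivalent to XOR-ing the
starting offset with the transfer index scaled to eight bytes. The
following sketch shows how external hardware or a simulator might
derive the read burst order of Table 5.1; the function name is
hypothetical, and only four-transfer cache line fills are modeled.

```python
def burst_addresses(first_addr: int) -> list:
    """Byte addresses of the four transfers of a cache line fill."""
    line_base = first_addr & ~0x1F   # 32-byte aligned cache line
    offset = first_addr & 0x18       # A4 and A3 of the first transfer
    # Transfer i toggles A3 when i is odd and A4 when i >= 2.
    return [line_base | (offset ^ (i << 3)) for i in range(4)]
```

For a read that starts at offset 8 within a line, the sketch yields
offsets 8, 0, 0x18, 0x10, so the two 64-bit words of each 16-byte
half are always fetched as a pair.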



Table 5.1. Burst Order for Cache Line Transfers

  1st        2nd        3rd        4th
  Address    Address    Address    Address

  0          8          0x10       0x18
  8          0          0x18       0x10
  0x10       0x18       0          8
  0x18       0x10       8          0





5.1.3 PIPELINED CYCLES 

A pipelined cycle is one that starts while one or two 
other bus cycles are outstanding. A cycle is consid- 
ered outstanding until the last BRDY# is asserted to 
terminate that cycle. A nonpipelined cycle is one 
that starts when no other bus cycles are outstand- 
ing. Both types of cycle can be either read or write 
cycles. To allow high transfer rates in large memory 
systems, the i860 XP microprocessor supports two- 
level pipelining. New cycles can start as often as 
every other clock until three cycles are outstanding. 

The system asserts NA# to indicate that the
i860 XP microprocessor can start another cycle before
the current one is completed. (NA# can even
be asserted while BRDY# is active.) The i860 XP
microprocessor begins sampling NA# in the next
clock after ADS# is asserted. If the following conditions
are met, a new (pipelined) cycle begins:

1. NA# having been active 

2. An internal request pending 

3. Compatibility between the pending request and 
the outstanding requests (refer to Table 5.2) 

4. HOLD, BOFF#, and AHOLD not active 

5. Fewer than three cycles outstanding 
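
The five conditions above can be collected into a single predicate.
This is a sketch for a bus model or testbench, not a description of
the processor's internal logic; the BusState class and its field
names are hypothetical, and the compatibility check of Table 5.2 is
reduced to one precomputed flag.

```python
from dataclasses import dataclass


@dataclass
class BusState:
    na_latched: bool        # NA# has been active (it is latched internally)
    request_pending: bool   # an internal request is waiting
    compatible: bool        # pending request passes the Table 5.2 rules
    hold: bool = False      # HOLD active
    boff: bool = False      # BOFF# active
    ahold: bool = False     # AHOLD active
    outstanding: int = 1    # bus cycles currently outstanding


def can_start_pipelined_cycle(s: BusState) -> bool:
    """True if all five conditions for a pipelined ADS# hold."""
    return (s.na_latched
            and s.request_pending
            and s.compatible
            and not (s.hold or s.boff or s.ahold)
            # A pipelined cycle starts while one or two are outstanding.
            and 1 <= s.outstanding < 3)
```

Because NA# is latched, na_latched stays true in such a model from
the clock NA# is sampled active until the resulting ADS# is issued.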

The following "compatibility" rules determine when 
the processor does not issue a pipelined ADS# 
(they are the source of Table 5.2): 

• Data cache line fills are pipelined into each other 
only in the case of an aliasing virtual tag miss with 
a physical tag hit. 



• Reads can be pipelined into TLB miss writes. TLB
misses for instructions can be pipelined into data 
accesses, and vice versa. 

• No data cycle is ever pipelined while LOCK# is 
active. 

• I/O cycles, special cycles, and ldint cycles never
begin when any cycle is outstanding. 

NA# may be asserted before, simultaneously with, 
or after the first BRDY# of the current cycle. If NA# 
is asserted before the first BRDY#, the cacheability 
(KEN#) and cache policy (WB/WT#) indicators for 
the current cycle are sampled during the same clock 
period as NA# is sampled active; otherwise, they 
are sampled with the first BRDY#. Figure 5.7 shows 
an example of four-transfer, pipelined, back-to-back 
reads. Note the timing of KEN#. Because NA# is 
asserted before the first BRDY# of the cycle A, 
KEN# is sampled with the NA# for cycle B. 



Table 5.2. Pipeline Cycle Compatibility

If A is Outstanding, can B be Pipelined into It?
(Rows give A, the outstanding cycle; columns give B, the pending cycle.)

                                   Data       Data Cache   Data Cache
                                   Cache      Store Miss,  Read Miss   Write-   Instruction         TLB    ldio, stio,   LOCK#
A (outstanding cycle)              Line Fill  Write-Thru   KEN# = 1    Back**   Fetch        pfld   Miss   ldint, scyc   Active

Data Cache Line Fill               YES*       YES*         YES*        YES      YES          YES*   YES    NO            YES
Data Cache Store Miss, Write-Thru  YES        YES          YES         YES      YES          YES    YES    NO            YES
Data Cache Read Miss, KEN# = 1     YES*       YES*         YES*        YES*     YES          YES*   YES    NO            YES*
Write-Back                         YES        YES          YES         NO       YES          YES    YES    NO            YES
Instruction Fetch                  YES        YES          YES         YES      YES          YES    YES    NO            YES
pfld                               YES        YES          YES         YES      YES          YES    YES    NO            YES
TLB Miss                           YES        YES          YES         YES      YES          YES    YES    NO            YES
stio, scyc                         YES        YES          YES         YES      YES          YES    YES    NO            YES
ldio, ldint                        NO         NO           NO          NO       YES          NO     YES    NO            NO
LOCK# Active                       NO         NO           NO          NO       YES          NO     YES    NO            NO

NOTE:
*  Pipelining can occur if the first ADS# is for an aliasing virtual tag miss with a physical tag hit.
** Inquiry write-backs are not pipelined into prior cycle unless FLINE# is asserted.

NOTES:
A  Four-transfer, cache line fill cycle
B  Four-transfer, cache line fill cycle
1. KEN# for A simultaneous with NA#

Figure 5.7. Pipelined Cache Line Fills



Write cycles can be pipelined into read cycles and 
vice versa, but, in both cases, the processor will 
leave one clock between bursts to allow bus turn- 
over, and will ignore any BRDY# given to it at that 
time. Pipelined back-to-back read and write cycles 
are shown in Figure 5.8. On writes, assertion of NA# 
does not cause the values on the data bus to 
change; it just enables new address and cycle speci- 
fication outputs. 



5.1.4 INTERRUPT ACKNOWLEDGE CYCLES 

In response to a trap caused by assertion of the INT 
pin, trap-handling software can generate interrupt 
acknowledge cycles by executing a procedure simi- 
lar to the following. 



// The following lock instruction must be on a 32-byte boundary:
    lock                        // Lock the bus
    ldint.b  src2, rdest        // First INTA cycle. Src2 contains 8.
    or       rdest, r0, rdest   // Won't proceed until rdest loaded.
    unlock                      // Unlock the bus after the next ldint
    //nop                       // Insert 4 + <number of NOPs> idle
    //nop                       // clocks for 8259A recovery.
    ldint.b  r0, rdest          // Second INTA cycle

Figure 5.8. Pipelined Back-to-Back Read and Write Cycles



Figure 5.9 shows the interrupt acknowledge cycles
generated by the code sequence. Interrupt acknowledge
cycles are generated in locked pairs. The interrupt
vector is returned during the second cycle. Each
of the interrupt acknowledge cycles is terminated
when the external system responds by asserting
BRDY#. Wait states can be added by withholding
BRDY#. There must be a number of idle clocks between
the first and second cycles to allow for 8259A
recovery time. The software controls the number of
intervening clocks via the number of nop instructions
in the interrupt acknowledge routine.
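
According to the comments in the code sequence, the routine yields
4 + (number of NOPs) idle clocks between the two INTA cycles. The
number of nop instructions needed for a given recovery requirement
can therefore be estimated as sketched below. The function name is
hypothetical, and the required idle-clock count is an input that
must come from the 8259A timing and the system clock rate; it is not
specified in this section.

```python
BASE_IDLE_CLOCKS = 4  # idle clocks present even with no nop instructions


def nops_for_recovery(required_idle_clocks: int) -> int:
    """Number of nop instructions to place between the two ldint's."""
    if required_idle_clocks < 0:
        raise ValueError("idle clock requirement cannot be negative")
    return max(0, required_idle_clocks - BASE_IDLE_CLOCKS)
```

For instance, a system that needs six idle clocks would enable the
two nop instructions shown commented out in the routine.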

5.1.5 SPECIAL BUS CYCLES 

The i860 XP microprocessor provides a special cycle
to indicate to the external system that certain
internal conditions have occurred. The special bus
cycle (indicated by M/IO# = 0, D/C# = 0, and
W/R# = 1) is generated by the i860 XP microprocessor
as a response to scyc instruction execution.
This cycle (defined in Table 5.3) is used to flush or
invalidate a secondary cache. The defined value of
byte enables can be generated by using an appropriate
address operand in the scyc instruction. The
scyc instruction does not have any effect on the
internal caches. External hardware must acknowledge
a special bus cycle by asserting BRDY# once.
The data driven on the data bus with BRDY# is
undefined. The effect of scyc is determined by
decoders in external hardware.

Figure 5.9. Example Interrupt Acknowledge Sequence

Table 5.3. Encoding of Special Bus Cycles

  BE7#-BE0#    Special Bus Cycle

  11110111     Write Back External Cache and Invalidate
  11111011     Halt
  11111101     Invalidate External Cache
  11111110     Shut Down

All other encodings are reserved.
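
An external decoder for special bus cycles can be modeled as a
lookup on the byte-enable pattern. This is an illustrative sketch
only: the function and table names are hypothetical, the pattern is
taken as an 8-bit integer with BE7# as the most significant bit, and
the Shut Down encoding is assumed to be 11111110, following the
single-low-bit pattern of the other rows of Table 5.3.

```python
SPECIAL_CYCLES = {
    0b11110111: "Write Back External Cache and Invalidate",
    0b11111011: "Halt",
    0b11111101: "Invalidate External Cache",
    0b11111110: "Shut Down",  # assumed encoding, see lead-in
}


def is_special_cycle(m_io: int, d_c: int, w_r: int) -> bool:
    """Cycle definition for a special bus cycle: M/IO#=0, D/C#=0, W/R#=1."""
    return (m_io, d_c, w_r) == (0, 0, 1)


def decode_special_cycle(byte_enables: int) -> str:
    """Map the BE7#-BE0# pattern to the special cycle it requests."""
    return SPECIAL_CYCLES.get(byte_enables & 0xFF, "reserved")
```

Whatever the decoded function, the external hardware still has to
terminate the cycle by asserting BRDY# once.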

5.2 Bus Arbitration 

The i860 XP microprocessor responds to three dif- 
ferent signals that tell it to stop driving the bus: 

HOLD Finishes outstanding cycles before giving 
up the bus. 

BOFF# Aborts outstanding cycles and gives up bus 
immediately. 

AHOLD Stops driving address bus and permits a 
cache inquiry. 

AHOLD results in a partial hold state, which is cov- 
ered in Section 5.3. The present section concen- 
trates on HOLD and BOFF#. 

When in a hold state (due either to HOLD or
BOFF#), the i860 XP microprocessor uses BREQ to
request control of the bus. If holding due to HOLD,
AHOLD, or BOFF#, the processor activates BREQ
in the clock after an internal bus request is generated.
(In the case of HOLD, BREQ is asserted even
though HLDA is asserted.) If holding due to BOFF#
and cycles need to be restarted or there is a new
internal request, it asserts the BREQ signal within
four clock periods after the assertion of BOFF#. In
all cases, BREQ remains active at least until the
clock after ADS# is activated for the requested cycle.



5.2.1 HOLD AND HLDA ARBITRATION 

HOLD indicates to the i860 XP microprocessor that 
another bus master needs control of the bus. When 
HOLD is asserted, the i860 XP microprocessor 
keeps control of the bus until all outstanding cycles 
are completed. Then it floats the output signals (ex- 
cept BREQ, HLDA, LOCK#, PCHK#, HIT#, and 
HITM#) and asserts HLDA. These outputs remain at 
the high-impedance state until HOLD is deasserted. 
HLDA may be asserted as soon as the clock period
after the one in which HOLD is asserted. HLDA may 
be deasserted as soon as the clock after the one in 
which HOLD is deasserted. 

An example HOLD/HLDA transaction is shown in 
Figure 5.10. The i860 XP microprocessor recognizes 
HOLD even while RESET is asserted, and it drives 
HLDA in this case as well. 

HOLD is recognized even when BOFF# is active, 
and the i860 XP microprocessor responds with 
HLDA the same as when the bus is idle. 



5.2.2 BUS CYCLE BACK-OFF AND RESTART 

The i860 XP microprocessor provides the ability to 
abort bus cycles and restart them again. It is neces- 
sary to abort cycles for reasons such as the follow- 
ing: 

1. Retry after an error is detected by ECC or parity 
logic. 

2. Escape from a deadlock; for example, when the
i860 XP microprocessor is using A31-A3 to load
a new cache line, but the 82495XP cache controller
needs A31-A5 to invalidate a line in the
CPU cache which the 82495XP cache controller
is replacing in its cache in order to satisfy the
CPU's line-fill request.



3. Maintain cache consistency; for example, the 
i860 XP microprocessor is attempting to read or 
write to a line that has been modified in the cache 
of another CPU. 

4. Prevent illegal access to an address already
locked by another CPU in a multiprocessor system.



5.2.2.1 Cycle Back-Off 

Bus cycles are aborted when the system asserts
BOFF#. The i860 XP microprocessor samples this
pin in every clock period that it is driving the bus.
When BOFF# is asserted, the i860 XP microprocessor
immediately (in the next clock period) floats the
bus. It floats the ADS# pin one clock period later,
thereby giving time for ADS# to be deasserted so
that it is not left floating active. The i860 XP
microprocessor floats the same pins as for HOLD, but
HLDA is not asserted. If a bus cycle is in progress at
the time BOFF# is asserted, the cycle is aborted,
and, in a read cycle, any data returned to the processor
while BOFF# is active is ignored. BOFF#
overrides BRDY#; so, if both are sampled active in
the same clock, BRDY# is ignored. BOFF# aborts
a burst cycle even if it arrives with the last BRDY#
of the cycle. However, for read bursts, data transfers
completed before assertion of BOFF# are used by
the processor if they satisfy an internal request.
Cacheable data is cached in spite of BOFF#; however,
the cached data is overwritten when the cycle
is restarted.

Figure 5.10. HOLD/HLDA Handshake

The bus remains in the high-impedance state until 
BOFF# is deasserted. If cycles need to be restarted 
or if a new internal request has been generated, the 
BREQ signal is asserted within four clock periods 
after the assertion of BOFF#. 

5.2.2.2 Cycle Restart 

When the system deasserts BOFF#, the i860 XP
microprocessor restarts aborted bus cycles from the
beginning by driving the address and status (A31-A3,
W/R#, D/C#, etc.) and asserting ADS#. If
more than one cycle was outstanding when BOFF#
was asserted, the i860 XP microprocessor restarts
all outstanding cycles in the same order. If HITM# is
active due to an inquiry, the write-back for it will be
the first cycle after deassertion of BOFF#. BOFF#
restarts all aborted cycles except:

• The stale cycles mentioned in Section 5.3.5.

• The read that may have been generated by an
alias hit (virtual tag miss, but physical tag hit).

• The read that may have been generated by a
pfld that hit the data cache.

If the processor's KEN# pin was active (with NA# 
or first BRDY#) before the cycle was aborted, exter- 
nal hardware must activate it again after the cycle is 
restarted. In other words, the system cannot use 
BOFF# to change the cacheability of a cycle via 
KEN#. 

The LOCK# signal is not affected by restarted cy- 
cles; it retains its state in spite of BOFF# assertion. 

5.2.2.3 Late Back-Off Modes 

In some cases the logic that needs to assert 
BOFF# cannot make the necessary decision in time 
to cancel the relevant cycle or data transfer. For ex- 
ample: 

1. The result of checking ECC or parity may not be
available until one or two cycles after the BRDY#
to which it corresponds.

2. When the i860 XP microprocessor is attempting
to read or write to a line that might be modified in
the cache of another processor on the same bus,
it may be advantageous to let part of a burst run
in parallel with inquiries to the other processors,
rather than delay the entire burst until the inquiries
are finished.

For such situations, the i860 XP microprocessor pro- 
vides late back-off mode. For a read cycle in this 
mode, the processor employs a buffer to internally 
delay data and BRDY#, which allows BOFF# as- 
sertion to be delayed relative to the external 
BRDY#. Likewise, for a write cycle in this mode, 
BOFF# assertion can be delayed relative to 
BRDY#. However, data for a write cycle is not de- 
layed. 

Two flavors of late back-off mode are provided: 

1. One allows BOFF# to be delayed by one clock 
period relative to the data transfer. The proces- 
sor enters one-clock late back-off mode when 
the FLINE# pin has been sampled active for at 
least three clock periods when RESET deacti- 
vates. 

2. The other allows BOFF# to be delayed by up to 
two clock periods relative to the data transfer. 
The i860 XP microprocessor enters this mode 
when software sets the LB bit of the dirbase 
register. 

If the processor enters one-clock late back-off mode 
during RESET, it is impossible to enter two-clock 
late back-off mode. The LB bit has no effect. Fur- 
thermore, software cannot exit two-clock late back- 
off mode once it is activated, and the LB bit cannot 
be cleared except by resetting the processor. 
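
As a rough illustration of how the two entry conditions interact, the following Python sketch returns the number of clocks by which BOFF# may lag the data transfer. The function name and its boolean parameters are assumptions made for this example; they only restate the rules given above.

```python
def late_backoff_mode(fline_active_at_reset: bool, lb_bit: bool) -> int:
    """Return 0 (normal), 1 (one-clock) or 2 (two-clock late back-off mode).

    fline_active_at_reset: FLINE# was sampled active for at least three
        clock periods when RESET deactivated.
    lb_bit: software has set the LB bit of the dirbase register.
    """
    if fline_active_at_reset:
        # One-clock mode selected at RESET; the LB bit then has no effect.
        return 1
    if lb_bit:
        return 2
    return 0
```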

Figures 5.12-5.17 illustrate variations on late back- 
off mode cycles. BOFF# can be (and usually is) as- 
serted longer than one clock period, as Figure 5.11 
shows; the remaining figures show an active time of 
only one clock. 

5.2.2.4 One-Clock Late Back-Off Mode 

In one-clock late back-off mode the data is delayed 
internally by one clock before it is used. 

In this mode, data and BRDY# are seen by internal 
logic one clock period later than they appear on the 
bus, which is equivalent to adding an extra wait state 
to reads on the external bus (Figure 5.13). All re- 
sponses to BRDY# (assertion of the ADS# for the 
next cycle, assertion of HLDA in response to a 
HOLD request, and deassertion of HITM#) are de- 
layed by one clock period compared to the normal 
mode of operation. Not delayed, however, are write 
data on D63-D0 and sampling of KEN# and WB/ 
WT#. KEN# and WB/WT# must be valid with the 
first BRDY# assertion. Also, the response to NA# 
(assertion of ADS#) is not delayed if fewer than 
three pipelined cycles are outstanding. 
NOTES:

A Noncacheable, 64-bit cycle (one transfer)
B Next cycle (any type)

1. BOFF# cancels cycle and data transfer

2. Cycle A restarts one clock after BOFF# is deasserted 

3. Earliest ADS# assertion for next cycle 



Figure 5.11. Normal Back-Off 




NOTES: 

A Noncacheable, 64-bit cycle (one transfer) 
B Next cycle (any type) 

1. BOFF# cancels cycle and data transfer 

2. Cycle A restarts one clock after BOFF# is deasserted 

3. Earliest ADS# assertion for next cycle



Figure 5.12. One-Clock Normal Back-Off 
NOTE:

1. Idle clock due to internal delay of BRDY# 




Figure 5.13. Fastest Nonpipelined Cycles in One-Clock Late Back-Off Mode 



If BOFF# is asserted as late as the second BRDY# 
(Figure 5.14), it cancels the entire cycle, ignores 
data latched with the first BRDY#, and ignores the 
data being driven with the second BRDY#. This is 
true of a two-transfer burst (shown) as well as a four- 
transfer burst (not shown). 

In a two-transfer burst, if BOFF# is asserted in the 
clock after the second BRDY# (Figure 5.15), it still 
cancels the cycle. 

In a four-transfer burst, if BOFF# is asserted within 
one clock after the last BRDY# (Figure 5.16), it still 
forces a retry of the cycle, but previously transferred 
read data is used by the processor if it satisfies the 
read request. 

5.2.2.5 Two-Clock Late Back-Off Mode 

Two-clock late back-off mode gives external logic 
even more time to decide to use BOFF#. In this 



mode, data delivery is delayed by either one or two 
clock periods, depending on external activity. For 
any BRDY#, the data is delayed by one clock peri- 
od. If in the next clock period BRDY# is again as- 
serted, the previous data is used. However, if in that 
next clock period BRDY# remains inactive, the data 
is delayed for one extra clock period before it is 
used. The responses to BRDY# (assertion of the 
ADS# for the next cycle, assertion of HLDA, and 
deassertion of HITM#) are delayed by one or two 
clock periods, depending on the value of BRDY# in 
the next clock. The response to NA# (assertion of 
ADS#) is not delayed if fewer than three pipelined 
cycles are outstanding. 
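
The one-or-two-clock rule can be illustrated with a short Python sketch. It reflects one reading of the paragraph above (the delay is one clock when BRDY# is asserted again in the next clock, two clocks otherwise); the function name and the list-of-booleans representation of BRDY# are assumptions made for the example.

```python
def internal_use_clocks(brdy):
    """Map each external BRDY# to the clock in which its data is used internally.

    brdy: list of booleans, one per clock; True means BRDY# is asserted.
    Returns a list of (external_clock, internal_clock) pairs.
    """
    pairs = []
    for clk, asserted in enumerate(brdy):
        if not asserted:
            continue
        # Look at BRDY# in the following clock (treated as inactive past the end).
        next_asserted = clk + 1 < len(brdy) and brdy[clk + 1]
        delay = 1 if next_asserted else 2
        pairs.append((clk, clk + delay))
    return pairs
```

For a BRDY# pattern of asserted, asserted, inactive, asserted, the first transfer is delayed one clock and the other two are delayed two clocks.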

The st.c dirbase instruction that sets the LB bit 
must be aligned on a 32-byte boundary and must be 
followed by seven nop instructions. Software must 
not enable late back-off mode when the processor is 
used with the 82495XP external cache controller. 
NOTES:

A Noncacheable, 128-bit cycle (two transfers)
B Next cycle (any type) 

1. BOFF# cancels both transfers (A1 in buffer, A2 on D63-D0) 

2. Cycle A restarts one clock after BOFF# is deasserted 

3. Earliest ADS# assertion for next cycle 



Figure 5.14. One-Clock Late Back-Off Mode (Case 1) 
NOTES:

A Noncacheable, 128-bit cycle (two transfers) 
B Next cycle (noncacheable) 

1. BOFF# cancels both transfers (A2 in buffer is needed to satisfy request) 

2. Cycle A restarts one clock after BOFF# is deasserted 

3. Earliest ADS# assertion for next cycle 



Figure 5.15. One-Clock Late Back-Off Mode (Case 2) 
NOTES:

A Cacheable 64-bit (or less) cycle (four transfers) 
B Next cycle (any type) 

1. BOFF# cancels A2 and A3 transfers, but A1 transfer has already satisfied request 

2. Cycle A restarts one clock after BOFF# is deasserted 

3. Earliest ADS# assertion for next cycle 



Figure 5.16. One-Clock Late Back-Off Mode (Case 3) 
Figure 5.17. Two-Clock Late Back-Off Mode



5.3 Cache Inquiry Cycles (Snooping) 

Another processor initiates an inquiry cycle to check 
whether an address is cached in the internal data or 
instruction cache of the i860 XP microprocessor. An 
inquiry cycle differs from any other cycle in that it is 
initiated externally to the i860 XP microprocessor, 
and the signal for beginning the cycle is EADS# (Ex- 
ternal Address Status) instead of ADS#. The ad- 
dress bus of the i860 XP microprocessor is bidirec- 



tional in order to allow the address of inquiry to be 
driven by the system. An inquiry cycle can begin dur- 
ing any hold state: 

1. While HOLD and HLDA are asserted. 

2. While BOFF# is asserted. 

3. While AHOLD (address hold) is asserted. 
If neither a HOLD nor a BOFF# is in effect, the system
can assert AHOLD to interrupt the current bus
activity. 

EADS# is first sampled two clocks after BOFF# or 
AHOLD assertion, or one clock after HLDA. This allows
time for the processor to float A31-A5 and for
the system to stabilize the inquiry address there. 
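
The sampling offsets stated above can be captured in a small lookup. This Python sketch is only a restatement of the rule for use in, say, a bus-model test bench; the table and function names are hypothetical.

```python
# Clocks between assertion of the hold-type signal and the first clock
# in which EADS# is sampled, as stated in the text above.
FIRST_EADS_SAMPLE_OFFSET = {"BOFF#": 2, "AHOLD": 2, "HLDA": 1}


def first_eads_sample_clock(signal: str, assertion_clock: int) -> int:
    """Return the first clock in which the processor samples EADS#."""
    return assertion_clock + FIRST_EADS_SAMPLE_OFFSET[signal]
```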

In the clock in which EADS# is asserted, the 
i860 XP microprocessor samples these inputs, 
which qualify the type of inquiry: 

INV Specifies whether the line (if found) must
be invalidated (that is, changed to I-state).

FLINE# Specifies whether the line (if found in
M-state) must be written back immediately or
after outstanding bus cycles are completed.

The i860 XP microprocessor compares the address 
of the inquiry request with addresses of lines in 
cache and of any line in the write-back buffer waiting 



to be transferred on the bus. It does not, however, 
compare with the address of write-miss data in the 
write buffers. Two clock periods after sampling 
EADS#, the i860 XP drives the results of the inquiry 
look-up on these output pins: 

HIT# Specifies whether the address was found 
(active) or not found (inactive). 

HITM# If active, the line found was in the M-state; 
if inactive, the line was in E- or S-state, or 
was not found. 

Figure 5.18 shows an inquiry with AHOLD that miss- 
es the cache. When the system asserts AHOLD, the 
i860 XP microprocessor floats A31-A3 in the next 
clock period. It does not, however, assert HLDA; no 
acknowledge is required. Once the address pins are 
floating, external logic drives the address for the inquiry
on A31-A5 and starts the inquiry cycle by activating
EADS#. The i860 XP microprocessor does
not begin sampling EADS# until the second clock 
after AHOLD is activated. EADS# activation may be 
delayed any number of clocks. 
NOTES:

A Outstanding cycle (for example, a single-transfer read) finishes during the inquiry 

1. Earliest assertion of EADS# is two clocks after assertion of AHOLD 

2. Earliest deassertion of AHOLD is one clock after assertion of EADS# 

3. HIT# is valid two clocks after assertion of EADS# 

4. Earliest assertion of ADS# for next cycle is one clock after deassertion of AHOLD 



Figure 5.18. Inquiry Miss Cycle
The earliest that AHOLD can be deasserted is the
clock after EADS# assertion. However, by maintain- 
ing AHOLD active, multiple inquiry cycles can be ex- 
ecuted in one AHOLD session (Figure 5.19). The 
i860 XP microprocessor can accept inquiry cycles at 
a rate of one every other clock period, unless a 
write-back is required. The earliest that ADS# can 
be asserted for the next cycle is the clock after 
AHOLD deassertion. 

The second inquiry in Figure 5.19 hits an unmodified 
line in the cache. When a cache line with matching 
address is found and the INV input signal is asserted 
(as in this case), that line is invalidated (changed to 
I-state). If the INV signal is inactive, the line enters
S-state. 



5.3.1 INQUIRY WRITE-BACK CYCLES 

If an inquiry cycle hits a dirty (M-state) line in the 
i860 XP microprocessor cache, the i860 XP micro- 
processor asserts the HITM# signal to indicate that 
the line will be written on the bus. The HITM# output 
becomes valid in the same clock period as HIT#. In 
this case the modified line is written out, and the 
cache entry is changed to either I or S state accord- 
ing to INV. The HITM# signal stays active through 
the last BRDY# for the corresponding write-back 
cycle. 
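
The inquiry outcomes described so far (miss, hit on an unmodified line, hit on a modified line) can be condensed into a short Python sketch. It is a simplified model, not a description of the processor's internal logic: the function name and return tuple are assumptions, a line that is not found is represented as state "I", and HIT#/HITM# are modeled as booleans that are True when the pin is active.

```python
def snoop(state: str, inv: bool):
    """Model one inquiry against a line in MESI state 'M', 'E', 'S' or 'I'.

    Returns (hit, hitm, new_state, writeback_needed).
    """
    if state == "I":
        # Address not found: HIT# and HITM# stay inactive.
        return (False, False, "I", False)
    # On any hit, INV selects invalidation; otherwise the line becomes shared.
    new_state = "I" if inv else "S"
    modified = state == "M"
    # A modified line asserts HITM# and must be written back on the bus.
    return (True, modified, new_state, modified)
```

For instance, an invalidating inquiry that hits a modified line gives HIT# and HITM# active, a final I-state, and a write-back.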

An inquiry write-back cycle is similar to ordinary 
write-back cycles. It is initiated by assertion of 
ADS#. ADS# is asserted even when the AHOLD 
NOTES:

A Outstanding cycle (for example, a single-transfer read) finishes during the inquiry 

B Earliest inquiry, no invalidation 

C Earliest successive inquiry, with invalidation 

1. EADS# is not sampled in the clock after its assertion 

2. Inquiry B misses cache 

3. Earliest deassertion of AHOLD is one clock after last assertion of EADS# 

4. Inquiry C hits cache, invalidates line 

5. Earliest assertion of ADS# for next cycle is one clock after deassertion of AHOLD 



Figure 5.19. Fastest Inquiry Cycles (Miss and Hit) 
signal is active. The cycle definition signals are driven
properly by the processor; however, the address
pins are not driven, because activation of AHOLD 
forces the i860 XP microprocessor off the address 
bus. If, however, AHOLD is deasserted before or 
during the write-back cycle, the i860 XP microproc- 
essor drives the correct address for the write-back. 

For all types of inquiry, the write-backs are not pipe- 
lined into an outstanding cycle, except when the 
FLINE# pin is used (refer to section 5.3.5). ADS# 
for the inquiry write-back is asserted from one to four 



clock periods after the HITM# pin is driven active or 
after the last BRDY# is returned for any outstanding 
cycle, whichever occurs later. 
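
As a worked illustration of the timing rule above, the following Python sketch computes the window of clocks in which ADS# for the inquiry write-back can appear. The function and parameter names are hypothetical; clocks are plain integers.

```python
def writeback_ads_window(hitm_clock, last_brdy_clock=None):
    """Return (earliest, latest) clock for ADS# of the inquiry write-back.

    hitm_clock: clock in which HITM# is driven active.
    last_brdy_clock: clock of the last BRDY# of any outstanding cycle,
        or None if no cycle is outstanding.
    """
    reference = hitm_clock
    if last_brdy_clock is not None:
        # Whichever event occurs later sets the reference point.
        reference = max(hitm_clock, last_brdy_clock)
    return (reference + 1, reference + 4)
```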

Bursts for a HITM# write-back, as for any write- 
back, are in the order 0, 8, 0x10, 0x18, because the 
i860 XP microprocessor ignores A4-A3 of the in- 
quiry address. 
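
The burst order can be made concrete with a short Python sketch. It assumes a 32-byte line (four 64-bit transfers), which follows from the inquiry address being driven on A31-A5; the function name is hypothetical.

```python
def writeback_burst_addresses(inquiry_address: int):
    """Return the four transfer addresses of a write-back burst, in bus order."""
    # A4-A3 (and the byte offset) are ignored: start at the 32-byte line base.
    line_base = inquiry_address & ~0x1F
    return [line_base + offset for offset in (0x00, 0x08, 0x10, 0x18)]
```

An inquiry address of 0x1238, for example, produces transfers at 0x1220, 0x1228, 0x1230, and 0x1238.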

Figure 5.20 shows an inquiry cycle that hits an
M-state line.
NOTES:

A Outstanding cycle (for example, a single-transfer read) 
W Write-back cycle 

1. EADS# is not sampled while HITM# is active 

2. Earliest ADS# assertion if not delayed by outstanding cycle 

3. ADS# for write-back delayed by outstanding cycle 

4. HITM# deactivates after last BRDY# of write-back 



Figure 5.20. Inquiry Hit Cycle with Write-Back 
The fact that a write-back cycle is initiated while address
lines are floating supports multiple inquiries
(with write-backs) during a single AHOLD session. 
This is especially useful during secondary cache re- 
placement processing, when the secondary-cache 
line is larger than that of the i860 XP microproces- 
sor. 

Note that EADS# is ignored as long as HITM# is 
active. If the system is executing a series of inquir- 
ies, it might happen that the HITM# assertion for 
one inquiry masks the EADS# for a subsequent in- 
quiry. In that case the system must reassert EADS# 
to restart the masked inquiry. 

Inquiries can occur during a hold due to HOLD/ 
HLDA or BOFF#. However, in these cases, the cy- 
cle definition pins and ADS# are floating. If an in- 
quiry requires a write-back, the HOLD or BOFF# 
must be deasserted so that the cycle definition pins 
and ADS# can be driven to start the write-back cy- 
cle. If HITM# is active at the time of ADS#, the first 
ADS# issued after HOLD is deasserted corre- 
sponds to the write-back of the modified line which 
was snooped. 

5.3.2 SNOOPING RESPONSIBILITY LIMITS 

The i860 XP microprocessor takes responsibility for 
responding to inquiry cycles for a cache line only 
during the time that the line is actually in the cache 
or in a write-back buffer. There are times during the 
cache line fill cycle and during the cache replace- 
ment cycle when the line is "in transit", and inquiry 
(snooping) responsibility must be taken by other sys- 
tem components. 

Systems designers should consider the possibility 
that an inquiry cycle may arrive at the same time as 
a cache line fill or replacement for the same ad- 
dress. This situation can occur: 

• In multiprocessor systems that have external
(secondary) caches with separate CPU and
memory busses, thereby allowing concurrent
activity on the two busses. In such systems, it is
desirable to run invalidation cycles concurrently 
with other i860 XP microprocessor bus activity. It 
can happen that writes on the memory bus cause 
invalidation requests to the i860 XP microproces- 
sor at the same time that the i860 XP microproc- 
essor fetches data from the secondary cache. 
Such events can occur at any time relative to 
each other. 

• In multiprocessor systems with no secondary
cache, if memory is dual-ported. In such systems, 
two processors can simultaneously read the 
same line, each sending an inquiry to the other. 

The simultaneous activities considered here may be 
for different data items in the same cache line. Un- 
less the inquiry request is timed carefully with re- 
spect to the cache fill cycle, the cache-consistency 
mechanism may be subverted, and data inconsist- 
encies may result (for example, both CPUs may get 
the line in E-state on a read). If the 82495XP and 
82490XP cache is being used, the timing with re- 
spect to the i860 XP microprocessor is handled cor- 
rectly by the cache controller; however, the same 
problem may arise between the memory system and 
the secondary cache. 

There are two cases to consider: 

1. Inquiry for a line that is being cached. 

2. Inquiry for a line that is being replaced. 

5.3.2.1 Inquiry for a Line Being Cached 

The i860 XP microprocessor accepts an inquiry cy- 
cle at any time, even if it hits the line being cached at 
that time. Regardless of the timing of the cycle, the 
i860 XP microprocessor delivers the read data to the 
load instruction that initiated the read request. How- 
ever, the timing of the invalidation cycle determines 
whether the line is placed in the cache and what 
value the i860 XP microprocessor drives on HIT#. 
Table 5.4 summarizes the different cases. 




Table 5.4. Inquiry for a Line being Cached

                                 EADS# before or with    EADS# after
                                 NA# or 1st BRDY#        NA# or 1st BRDY#

Line is cached?                  YES                     NO
HIT# =                           Inactive                Active
Data/Instruction used by CPU?    YES                     YES
If EADS# is asserted before or with the sampling of
KEN#, the processor cannot match the address of
the line being cached with an invalidation request. 
Thus, the processor does not assert HIT#. The ex- 
ternal system must satisfy the inquiry with the cor- 
rect data and WB/WT# status. If invalidation of that 
line is required, the system must do one of the fol- 
lowing: 

• Delay assertion of EADS# until one clock after 
assertion of KEN#. 

• Reassert EADS# after KEN#. 



• Make KEN# inactive at the first BRDY# or NA#, 
thereby preventing the line from being cached. 
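
The two columns of Table 5.4 can be expressed as a small decision function. This Python sketch is a simplified restatement of the table for an invalidating inquiry to the address being filled; the function name, the integer clock parameters, and the dictionary keys are assumptions made for the example.

```python
def line_fill_inquiry_outcome(eads_clock: int, ken_sample_clock: int):
    """Outcome of an inquiry that targets the line currently being filled.

    ken_sample_clock: clock of NA# or the first BRDY#, when KEN# is sampled.
    """
    if eads_clock <= ken_sample_clock:
        # Too early: the processor cannot match the address, so HIT# stays
        # inactive and the line is still placed in the cache.
        return {"line_cached": True, "hit": False, "data_used": True}
    # Late enough: the inquiry is recognized and the line is not cached.
    return {"line_cached": False, "hit": True, "data_used": True}
```

In both cases the read data is still delivered to the instruction that requested it.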

Figures 5.21 and 5.22 show when the i860 XP micro- 
processor picks up responsibility for inquiries for a 
line that it is caching. Figure 5.21 shows the earliest 
EADS# assertion that invalidates the line being 
cached relative to the first BRDY# for nonpipelined 
cycles. Figure 5.22 shows the earliest EADS# as- 
sertion that invalidates the line being cached relative 
to the first NA# for pipelined cycles. These timings 
hold for normal and late back-off modes. 
NOTES:

A Cache line fill cycle 

S Snoop (inquiry) cycle 

R Addresses of cache line fill and snoop are the same 

1. Earliest EADS# assertion that can invalidate line being filled 



Figure 5.21. Snoop Responsibility Pickup (Nonpipelined Cycle) 
NOTES:

A Cache line fill cycle 

B Next cycle (any type) 

S Snoop (inquiry) cycle 

R Addresses of cycles A and S are the same 

1. Earliest EADS# assertion that can invalidate line being filled 




Figure 5.22. Snoop Responsibility Pickup (Pipelined Cycle) 



5.3.2.2 Inquiry for a Line Being Replaced 

When the i860 XP microprocessor is replacing a line, 
there are two cases: 

1. If the replacement does not require write-back, 
the address being replaced can be matched by 
an inquiry until assertion of NA# or first BRDY# 
of the line-fill cycle. From that point on, the in- 
quiry has no effect. 

2. If the replacement requires a write-back, the ad- 
dress being replaced can be matched by an in- 
quiry until assertion of the last BRDY# for the 
write-back. An EADS# as late as two clocks be- 
fore the last BRDY# can cause HITM# to be 
asserted. 



Figures 5.23 through 5.25 show when the i860 XP 
microprocessor drops responsibility for recognizing 
inquiries for a line that it is writing back. They show 
the latest EADS# assertion that can cause HITM# 
assertion. In late back-off mode, EADS# can be as- 
serted later, because BRDY# is internally delayed 
(Figures 5.24 and 5.25). 

In all these cases, HITM# remains active for only 
one clock period. HITM#, as always, remains active 
through the last BRDY# of the corresponding write- 
back; in these cases the write-back has already 
completed. 

If an inquiry cycle hits the write-back address after 
its ADS# has been issued, the i860 XP microproc- 
essor asserts HITM#; however, HIT# is deassert- 
ed. This unique combination of values on HIT# and 
HITM# indicates that the write-back cycle corre- 
sponding to the HITM# has already been issued. 
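
External logic has to distinguish this special combination from the ordinary inquiry results. The following Python sketch shows one way to decode the pair; the function name and the result strings are hypothetical, and each pin is modeled as a boolean that is True when active.

```python
def decode_inquiry_result(hit: bool, hitm: bool) -> str:
    """Interpret the HIT#/HITM# pair driven after an inquiry."""
    if hit and hitm:
        return "hit modified line: write-back will follow"
    if hit:
        return "hit unmodified line"
    if hitm:
        # HITM# active with HIT# inactive: the matching write-back
        # cycle has already been issued.
        return "hit line whose write-back was already issued"
    return "miss"
```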
NOTES:

A Write-back cycle 

S Snoop (inquiry) cycle 

R Addresses of cycles A and S are the same 



Figure 5.23. Latest Snooping of Write-Back (Not Late Back-Off Mode) 
NOTES:

A Write-back cycle 

S Snoop (inquiry) cycle 

R Addresses of cycles A and S are the same 



Figure 5.24. Latest Snooping of Write-Back (One-Clock Late Back-Off Mode) 
NOTES:

A Write-back cycle 

S Snoop (inquiry) cycle 

R Addresses of cycles A and S are the same 




Figure 5.25. Latest Snooping of Write-Back (Two-Clock Late Back-Off Mode) 



5.3.3 WRITE CYCLE REORDERING DUE TO BUFFERING



The MESI cache protocol and the ability to perform 
and respond to inquiry cycles guarantee that writes 
to the cache are logically equivalent to writes that go 
to memory. In particular, the order of read and write 
operations on cached data is the same as if the op- 
erations were on data in memory. Even uncached 
memory read and write requests usually occur on 
the external bus in the same order that they are is- 
sued in the program. For example, when a write miss 
is followed by a read miss, the write data goes onto 
the bus before the read request is put on the bus. 
However, the posting of writes in write buffers cou- 
pled with inquiry cycles may cause the order of 
writes seen on the external bus to differ from the 
order they appear in the program. Consider the fol- 
lowing example, which is illustrated in Figure 5.26: 

1. Three bus cycles are outstanding.

2. Processor 1 executes a store to address A, which 
misses the cache. This store is posted; that is, 
the data is latched in the write buffer while the 
processor continues execution without waiting for 
the store to be completed on the bus. In this case 
the store is not even put on the bus because 
there are already three outstanding cycles. 



3. Processor 1 executes a store to address B, which 
hits the cache. 

4. Processor 2 executes an inquiry for address B. 
Processor 1 looks in its cache, finds the modified
line, asserts HIT# and HITM#, and executes a 
write-back cycle to address B, while the data for 
address A is still in the write buffer. 

5. Processor 1 issues the write to address A on the 
bus. 

In this example, the original order of the writes has 
been changed. In most cases it is not necessary that 
the ordering of writes be strictly maintained. But 
there are cases (for example, semaphore updates in 
a multiprocessor system) that require stores to be 
observed externally in the same order as pro- 
grammed. There are several ways to ensure seriali- 
zation of stores: 

1. Bracket one of the stores with the lock and 
unlock instructions. That forces serialization of 
the stores (refer to section 5.4). In the above example
of a store-miss followed by store-hit, locking
either store would ensure that the internal
store-hit does not update the cache until the miss 
gets to the external bus. 

2. Apply the write-through policy to the critical data, 
by setting WT = 1 in the page table entries or by
driving the WB/WT# pin low. 
NOTES:

A Data written by st.x A instruction
B Data written by st.x B instruction

1. Snoop for address of B

2. Snoop look-up in tag array occurs here; finds B modified 

3. Write-back of line containing B occurs before write of A 



Figure 5.26. Write Reordering due to Buffering 



3. Configure the processor for Strong Ordering 
Mode by asserting EWBE# during RESET.

Option 1 is implementable by user-level programs, 
while option 2 is an operating-system level solution, 
not directly implementable by user-level code. Op- 
tion 3, the hardware solution, is discussed in greater 
detail in section 5.3.4. 



5.3.4 STRONG ORDERING MODE 

In strong ordering mode, the processor delays up- 
dates to its internal data cache in either of these 
conditions: 

1. The internal write buffer is not empty. 

2. An external write buffer is not empty (the external 
system signals this condition by deactivating the 
EWBE# signal). 

By delaying the cache update until all write buffers 
are empty, the i860 XP microprocessor avoids the 
out-of-order sequence shown in section 5.3.3. 
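
To make the difference concrete, the following Python sketch replays the store-miss, store-hit, inquiry sequence of section 5.3.3 with and without strong ordering. It is a deliberately simplified model: the function name is hypothetical, only the internal write buffer is modeled (no EWBE#), and the line containing B is assumed to be unmodified before the store.

```python
def replay_example(strong_ordering: bool):
    """Return (bus write order, whether line B ends up modified in the cache)."""
    bus = []
    write_buffer = ["A"]      # store miss to A is posted in the write buffer
    line_b_modified = False
    update_pending = False

    # Store to B hits the cache.
    if strong_ordering and write_buffer:
        update_pending = True          # update delayed: write buffer not empty
    else:
        line_b_modified = True         # cache updated immediately

    # Inquiry for B arrives from another processor.
    if line_b_modified:
        bus.append("write-back B")     # modified line goes out ahead of A
        line_b_modified = False

    # The posted write to A finally reaches the bus.
    bus.append("write " + write_buffer.pop(0))

    # With the buffer empty, the delayed cache update completes.
    if update_pending and not write_buffer:
        line_b_modified = True
    return bus, line_b_modified
```

Without strong ordering the model puts the write-back of B on the bus before the write of A; with strong ordering the write of A appears first and B is updated in the cache only afterwards.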



In strong ordering mode, EWBE# can be deasserted
only between the ADS# and the last BRDY# of
a store. The earliest deassertion is the clock after 
ADS#; the latest deassertion is together with the 
last BRDY#. EWBE# can be reasserted at any 
time, except when the processor is performing an 
inquiry write-back. In other words, EWBE# must not 
activate while HITM# is active. When EWBE# goes 
active, the processor completes any cache update 
that may have been delayed by its deassertion. 

Figure 5.27 shows how an external cache can use 
EWBE# when a store miss in the i860 XP micro- 
processor is also a miss in the external cache. 

An external cache controller should also refrain from 
updating the external cache while EWBE# is active. 
[Timing diagram; signals shown: CLK, W/R#, BRDY#, EWBE#]

NOTES:

1. Assumes the external cache needs five cycles to write the data to memory.

2. Pending internal data cache updates are delayed until the clock in which EWBE# is sampled LOW.



Figure 5.27. Timing of EWBE#



5.3.5 SCHEDULING INQUIRY WRITE-BACK CYCLES



In order to preserve system-wide ordering of memo- 
ry transactions in multiprocessor systems that have 
a pipelined or split-transaction memory bus, it may 
be necessary to get the data corresponding to an 
inquiry hit before outstanding bus cycles are com- 
pleted. Another bus master can always request an 
inquiry while the i860 XP microprocessor has cycles 
outstanding on the bus. However, when AHOLD is 
asserted, the i860 XP microprocessor normally com- 
pletes outstanding cycles before it performs any 
write-back that may be required. The i860 XP micro- 
processor provides two methods for causing the in- 
quiry write-back before outstanding cycles are com- 
pleted: 

FLINE# When FLINE# is asserted during the 
EADS# of an inquiry that hits an M-state 
line, the i860 XP microprocessor issues a 
write-back cycle and writes the dirty line to 
memory before the outstanding bus cycles 
are completed. 

BOFF# If there are outstanding cycles on the bus, 
asserting BOFF# clears the bus pipeline. 
If an inquiry causes HITM# to be asserted, 
then the first cycle issued by the i860 XP 
microprocessor after deassertion of 
BOFF# is the inquiry write-back cycle. Af- 
ter the inquiry write-back, it reissues the 
aborted cycles. 

5.3.5.1 Choosing between FLINE# and BOFF# 

FLINE#, although the more efficient choice, cannot 
handle all situations. Under certain circumstances, it 
can happen that outstanding stores on the bus cor- 



respond to data that is obsolete relative to the data 
in the cache, because a subsequent store has up- 
dated the cache after the ADS# for the outstanding 
store has occurred. For example: 

• An aliasing store hit, in which a cache virtual-tag 
miss occurs and the ADS# is issued at the same 
time as a physical-tag hit. Then the cached data 
would be updated before external memory, and a 
subsequent store to the new virtual address 
could also update cache before the outstanding 
bus store completed. 

• Back-to-back writes to the same line can also up- 
date the cache more recently than the bus when 
the write-once update policy is employed. The 
first write updates the cache and generates a bus 
write request, but the second write only updates 
the cache. 

In both of these examples the outstanding stores on 
the bus are obsolete relative to the data in the cache 
line. If an inquiry cycle hits a line and this line is 
written back out of order (that is, before outstanding 
stores are completed), special care should be taken 
to discard the outstanding stores. 

The easiest way to avoid this situation is not to as- 
sert FLINE# when stores are outstanding, but use 
BOFF# instead. If out-of-order write-back is imple- 
mented with BOFF#, the i860 XP microprocessor 
does not restart the outstanding store to that line if 
such a store has been obsoleted by a later cache hit 
store. That is, the i860 XP microprocessor detects 
this condition and kills the obsolete data. However, 
lock-bracketed stores (including the last store in the 
lock sequence) are restarted by the i860 XP micro- 
processor, because lock-bracketed stores update 
the cache only after BRDY# is returned. 
If, on the other hand, out-of-order write-back is implemented
by using only the FLINE# pin, the external
system must return BRDY#s for outstanding
stores, but the data must be ignored if it has already
been written out by an inquiry write-back.

Note that if a replacement write-back is in progress 
(ADS# has been issued, but last BRDY# has not 
occurred) and an inquiry hits the same line that is 
being written back, the FLINE# pin is ignored. The 
system can recognize this special case by the fact 
that HITM# is asserted while HIT# is deasserted. If 
other cycles are outstanding and it is necessary to 
write the line back before the other cycles, BOFF# 
can be used. 
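
The guidance in this section can be reduced to a simple selection rule for system logic. The Python sketch below follows the "easiest way" recommended above rather than covering every legal design; the function name and parameters are hypothetical.

```python
def reorder_method(stores_outstanding: bool, hits_replacement_writeback: bool) -> str:
    """Pick the signal used to force an inquiry write-back ahead of other cycles.

    hits_replacement_writeback: the inquiry hit a line whose replacement
        write-back is in progress (HITM# asserted while HIT# is deasserted).
    """
    if hits_replacement_writeback:
        return "BOFF#"   # FLINE# is ignored in this special case
    if stores_outstanding:
        return "BOFF#"   # avoids having to discard obsolete store data externally
    return "FLINE#"      # the more efficient choice when it is safe
```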



5.3.5.2 Reordering Write-Backs with FLINE# 

FLINE# must be active during the EADS# that initiates
an inquiry. BRDY# must not be asserted for the
previously issued cycles while HITM# is active. If
HITM# is asserted while the data transfer of the
outstanding cycle is in progress (i.e., first BRDY#
has been asserted, but the entire transfer has not 



yet been completed), the i860 XP microprocessor 
waits for the current cycle to complete, and only 
then issues the write-back. After the last BRDY# for 
the ongoing burst (if any), BRDY# is ignored until 
the clock period after ADS# is asserted for the
write-back. 

From the viewpoint of the i860 XP microprocessor, 
an inquiry write-back cycle is just another bus cycle; 
so, if there is an outstanding cycle at the time of 
FLINE# and HITM# activation, the system must as- 
sert NA# to initiate the write-back. 

Figure 5.28 illustrates simple cycle reordering, when 
FLINE# is not asserted during the data transfer of 
another cycle. The outstanding request could be ei- 
ther a read or write. 

Figure 5.29 shows the case in which FLINE# is as- 
serted after data transfer for the outstanding cycle 
has already started. In this case, the i860 XP micro- 
processor does not issue a write-back until the out- 
standing transfer is completed. NA# is needed in 
this example only if other outstanding cycles remain. 
Figure 5.28. Cycle Reordering via FLINE# (No Ongoing Burst)

NOTES:

1. BRDY# is ignored by CPU from end of ongoing burst through ADS# of write-back, even if other cycles remain 
outstanding 

2. NA# required only if another cycle is outstanding 

3. If the first BRDY# is asserted here or sooner (relative to HITM#), the outstanding cycle completes before the 
FLINE# write-back. 




Figure 5.29. Cycle Reordering via FLINE# (Ongoing Burst) 
5.3.5.3 Reordering Write-Backs with BOFF#

Back-off cycles are discussed in general in Section 
5.2.2. Figure 5.30 shows how BOFF# can be used 
to cancel outstanding cycles so that an inquiry write- 
back can take place immediately. 



5.4 The LOCK# Cycle Attribute 

The processor asserts the LOCK# signal when sev- 
eral accesses to a single memory location must be 
effectively uninterruptible. By causing LOCK# to be
asserted, a programmer can, for example, increment 
the contents of a memory variable and be assured 
that the variable will not be accessed between the 
read and the update of that variable. 



NOTES:

A Outstanding cycle (for example, noncacheable 128-bit read)
W Write-back cycle

1. AHOLD begins an inquiry while one cycle is outstanding. 

2. Earliest assertion of EADS# is two clocks after assertion of AHOLD 

3. Inquiry hits modified line. 

4. Assertion of BOFF# aborts the outstanding cycle. 

5. BRDY# asserted during BOFF# is ignored by CPU. 

6. Write-back begins after deassertion of BOFF#. 

7. Earliest assertion of ADS# for restart of cycle A (assuming no pipelining). 



Figure 5.30. Cycle Reordering via BOFF# (Ongoing Burst) 
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The memory location to be locked is the one whose 
address is driven during the cycle in which LOCK# 
is first activated. In multiprocessor systems, external 
hardware should guarantee that no other processor 
is granted a locked read, locked write, or unlocked 
write to the same location until LOCK# is deassert- 
ed. The i860 XP microprocessor has no hardware 
provision to prevent another master from also lock- 
ing the variable; this responsibility falls on the bus 
arbiter. In the simplest implementation, the arbiter 
can globally prevent other masters from accessing 
the bus. 



Not all cycles affect the value of LOCK#. Code 
fetches, write-backs due to replacement or inquiry, 
and cycles restarted due to BOFF# do not affect 
LOCK#. Any other type of cycle can be used to 
initiate or terminate LOCK#, including cache line 
fills, interrupt acknowledge, I/O, and special cycles. 

Data accesses with LOCK# asserted are not 
pipelined, and other data cycles are not pipelined 
while a LOCK# cycle remains outstanding. Instruction 
fetches, however, may be pipelined during lock. 

The i860 XP microprocessor can run very long lock 
sequences; therefore, to guarantee reasonable bus 
turnover latency in multimaster systems, the i860 XP 
microprocessor recognizes bus hold (HOLD), address 
hold (AHOLD), and back-off (BOFF#) while the 
LOCK# signal is active. In spite of such intervening 
conditions, the arbiter should prevent any other bus 
master from also locking or updating the variable 
the i860 XP microprocessor locked. In simple systems 
the HOLD input can be masked by the LOCK# output 
(that is, the external logic that generates HOLD can 
AND the LOCK# signal with other hold conditions). 
More sophisticated systems, however, may allow the 
bus to be turned over while LOCK# is asserted. 


Whatever the lock implementation, arbiter design 
must, in one case, allow another processor to write 
the locked variable. That case is when another 
i860 XP microprocessor or master asserts HITM# in 
response to the inquiry generated by the locking 
processor's initial read. That other master must write 
back the locked variable before the i860 XP micro- 
processor can read it. This HITM# write-back must 
always be allowed. 
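
The two arbiter rules described in this section (masking HOLD with LOCK# in simple systems, and always permitting the HITM# write-back) can be modeled with a minimal Python sketch. The function names and the boolean signal representation are illustrative assumptions and are not part of the i860 XP specification; `lock_n` is taken to be the electrical level of the active-low LOCK# pin.

```python
def hold_to_cpu(hold_request: bool, lock_n: bool) -> bool:
    """HOLD input presented to the CPU in a simple system.

    lock_n is the electrical level of the active-low LOCK# output
    (False means LOCK# is asserted). ANDing it with the other hold
    conditions masks HOLD while a locked sequence is in progress.
    """
    return hold_request and lock_n


def other_master_write_allowed(lock_n: bool, same_location: bool,
                               is_hitm_writeback: bool) -> bool:
    """Hypothetical arbiter check for a write by another master."""
    if is_hitm_writeback:
        return True   # the HITM# write-back must always be allowed
    if not lock_n and same_location:
        return False  # the locked variable is protected
    return True
```

A more sophisticated arbiter would track the locked address rather than blocking the whole bus; the sketch only captures the two conditions stated in the text.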

The timing of LOCK# is shown in Figure 5.31. Note 
that LOCK# is asserted in the same clock period as 
ADS# for the locked address, but is deasserted in 
the clock period after ADS# for the unlocking load 
or store. 
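
As a rough illustration of this timing rule, the following Python sketch marks the clocks in which LOCK# is asserted, given the clock numbers of the locking and unlocking ADS#. The function name and the clock-index convention are assumptions made for the example only.

```python
from typing import List


def lock_asserted(n_clocks: int, locking_ads: int,
                  unlocking_ads: int) -> List[bool]:
    """Return, for each clock, whether LOCK# is asserted.

    LOCK# is asserted in the same clock as ADS# for the locking
    access and stays asserted through the clock of ADS# for the
    unlocking access; it is deasserted in the following clock.
    """
    return [locking_ads <= clk <= unlocking_ads
            for clk in range(n_clocks)]
```

For example, with the locking ADS# in clock 2 and the unlocking ADS# in clock 5, LOCK# is asserted in clocks 2 through 5 and deasserted from clock 6.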
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NOTES: 

L Locking access 
U Unlocking access 

1. This address is to be locked 

2. LOCK# is asserted with ADS# 

3. LOCK# is deasserted one clock after ADS# 



Figure 5.31. LOCK# Timing 
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5.5 RESET Initialization 

Initialization of the i860 XP microprocessor is caused 
when the system asserts the RESET signal for at 
least ten clocks. Table 5.5 shows the status of out- 
put pins during the time that RESET is asserted. 
Note that the bidirectional data pins (D63-D0 and 
DP7-DP0) are floated during RESET, though the bi- 
directional A31-A3 pins are not. If the i860 XP mi- 
croprocessor is used with 82495XP and 82496XP 
cache, however, the latter do float the bidirectional 
pins they share with i860 XP microprocessor during 
RESET. Note that HOLD requests are honored dur- 
ing RESET and that the HLDA output signal may 
also become active. The status of output pins de- 
pends on whether a HOLD request is being acknowl- 
edged. Note also that the test logic may be active 
during RESET and that the EXTEST instruction may 
drive other values on the output pins. 

After the RESET signal goes inactive the processor 
remains in the RESET state for three more clocks. 
Applications that use the HOLD signal to float the 
bus during RESET should keep HOLD active for 
three more clocks after the RESET signal is 
deactivated. 

Some aspects of processor configuration are deter- 
mined by asserting input signals during RESET. To 
select a given option, the corresponding input must 
be asserted for at least the last three clocks before 
the falling edge of RESET; to deselect, the corre- 
sponding input must be deasserted for at least the 
last three clocks before the falling edge of RESET: 

EWBE# Enter strong ordering mode. 

FLINE# Enter one clock late back-off mode. 

INT/CS8 Enter eight-bit code-size mode. 

PEN# Enter normal (small output buffers) cur- 
rent mode. 

Figure 5.32 shows how configuration pins are sampled 
during the three clock periods just before the 
falling edge of RESET. No inputs besides EWBE#, 
HOLD, FLINE#, INT/CS8, and PEN# are sampled 
during RESET. 
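
The sampling rule can be sketched in Python as a check over a per-clock trace. This is only a model under stated assumptions: the function name, the list-based trace format, and the use of `None` for a trace that meets neither condition are all hypothetical; `True` means the signal is logically asserted in that clock.

```python
from typing import Optional, Sequence

MIN_RESET_CLOCKS = 10  # RESET must be asserted for at least ten clocks
SAMPLE_CLOCKS = 3      # options are sampled over the last three clocks


def option_selected(reset: Sequence[bool],
                    pin_asserted: Sequence[bool]) -> Optional[bool]:
    """Decide whether a configuration option is selected.

    Returns True (selected), False (deselected), or None when the
    trace does not satisfy the requirements stated in the text.
    """
    if True not in reset:
        return None
    start = list(reset).index(True)
    fall = start
    while fall < len(reset) and reset[fall]:
        fall += 1  # first clock in which RESET is inactive again
    if fall == len(reset) or fall - start < MIN_RESET_CLOCKS:
        return None
    window = pin_asserted[fall - SAMPLE_CLOCKS:fall]
    if all(window):
        return True
    if not any(window):
        return False
    return None  # pin changed inside the three-clock window
```

The same check applies to each of the configuration inputs (EWBE#, FLINE#, INT/CS8, PEN#).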



Table 5.5. Output Pin Status during Reset 

                                          Pin Value 
Pin Name                         HOLD               HOLD 
                                 Not Acknowledged   Acknowledged 

BREQ                             LOW                LOW 
HLDA                             LOW                HIGH 
W/R#, PWT, PCD                   LOW                Tristate OFF 
ADS#                             HIGH               Tristate OFF 
D63-D0, DP7-DP0                  Tristate OFF       Tristate OFF 
A31-A3, BE7#-BE0#, NENE#,        Undefined          Tristate OFF 
  CACHE#, CTYP, D/C#, KB0, 
  KB1, LEN, M/IO#, PCYC 
PCHK#, HIT#                      Undefined          Undefined 
HITM#, LOCK#                     HIGH               HIGH 

NOTE: 

This table does not apply if the test logic is running the EXTEST instruction. 
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Figure 5.32. Reset Activities 
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While in eight-bit code-size mode, instruction cache 
misses are one-byte reads (transferred on D7-D0 of 
the data bus) instead of eight-byte reads. This allows 
the i860 XP microprocessor to be bootstrapped from 
an eight-bit ROM. For these code reads, byte en- 
ables BE2#-BE0# are redefined to be the low or- 
der three bits of the address, so that a complete 
byte address is available. The entire eight-byte data 
bus continues to be parity-checked by the i860 XP 
microprocessor during CS8-mode instruction fetch- 
es; therefore, external hardware must either gener- 
ate good parity on all eight bytes or disable parity 
traps by deasserting PEN# during CS8 mode. 
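
A minimal Python sketch of how external ROM-decode logic might form the complete byte address in CS8 mode follows. It assumes the three bits are taken at face value from BE2#-BE0#, with BE2# as the most significant; the function name and the integer encoding of the pins are illustrative only.

```python
def cs8_byte_address(a31_a3: int, be2_be0: int) -> int:
    """Combine A31-A3 with the three address bits on BE2#-BE0#."""
    if not 0 <= a31_a3 < (1 << 29):
        raise ValueError("A31-A3 is a 29-bit field")
    if not 0 <= be2_be0 < 8:
        raise ValueError("BE2#-BE0# carry three address bits")
    return (a31_a3 << 3) | be2_be0
```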

While in this mode, instructions must reside in an 
eight-bit wide memory, while data must reside in a 
separate 64-bit wide memory. After the code has 
been loaded into 64-bit memory, initialization code 
can initiate 64-bit code fetches by clearing the CS8 
bit of the dirbase register (refer to section 2). Once 
eight-bit code-size mode is disabled by software, it 
cannot be reenabled except by resetting the i860 XP 
microprocessor. 

Instruction fetches in CS8 mode update the instruction 
cache if KEN# is asserted during NA# or all of 
the first eight BRDY#s (refer to section 4.2.26). 
They are pipelined if NA# is asserted. When used 
with the 82495XP and 82496XP cache, CS8 mode 
works only if the ROM locations are made non- 
cacheable. 



6.0 TESTABILITY 

The i860 XP microprocessor provides testability fea- 
tures compatible with the proposed Standard Test 
Access Port and Boundary-Scan Architecture (IEEE 
Std. P1149.1/D6). The subset of the standard test 
logic implemented in the i860 XP microprocessor 
provides for testing the interconnections between 
the i860 XP microprocessor and other integrated cir- 
cuits once they have been assembled onto a printed 
circuit board. 

The test logic consists of a boundary-scan register 
and other building blocks that are accessed through 
a test access port (TAP). The TAP provides a simple 
serial interface that makes it possible to test all sig- 
nal traces with only a few probes. 

The TAP can be controlled by a bus master. The bus 
master can be either automatic test equipment or a 
component that interfaces to a four-pin test bus. 



6.1 Test Architecture 

The test logic contains the following elements: 

• Test access port (TAP), which consists of input 
  pins TMS, TCK, TDI, and TRST#; and output pin 
  TDO. 

• TAP controller, which receives the dedicated test 
  clock (TCK) and interprets the signals on the test 
  mode select (TMS) line. The TAP controller 
  generates clock and control signals for the 
  instruction and test data registers and for other 
  parts of the test logic. 

• Instruction register (IR), which allows instruction 
  codes to be shifted into the test logic. The 
  instruction codes are used to select the test to be 
  performed or the test data register to be accessed. 

• Test data registers: Bypass Register (BPR), Device 
  Identification Register (DID), and Boundary-Scan 
  Register (BSR). 

The instruction and test data registers are separate 
shift-register paths connected in parallel and having 
a common serial data input and a common serial 
data output connected to the TAP TDI and TDO sig- 
nals respectively. 



6.2 Test Data Registers 

The test logic contains the following data registers: 

• Bypass Register (BPR): BPR is a one-bit shift 
  register that provides a minimum-length path 
  between TDI and TDO when no test operation of 
  the component is required. This allows more rapid 
  movement of test data to and from other board 
  components that are required to perform test 
  operations. While running through BPR, the data is 
  transferred without inversion from TDI to TDO. 

• Device Identification Register (DID): This register 
  contains the manufacturer's identification code, 
  part number code, and version code in the format 
  shown by Figure 6.1. The values are: manufacturer's 
  identification code (9), part number code (61A0), 
  version code (8), entire 32-bit value (0x861A0013). 

• Boundary Scan Register (BSR): The BSR is a 
  single shift-register path containing 150 cells that 
  are connected to all input and output pins of the 
  i860 XP microprocessor. Figure 6.2 shows the 
  logical structure of the BSR. Input cells only 
  capture data; they do not affect operation of the 
  i860 XP microprocessor. Data is transferred without 
  inversion from TDI to TDO through the BSR 
during scanning. The BSR can be operated by 
the EXTEST and SAMPLE instructions. 
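
The DID values quoted above can be cross-checked with a short Python sketch. It assumes the field positions of the IEEE 1149.1 device identification register (version in bits 31-28, part number in bits 27-12, manufacturer code in bits 11-1, bit 0 fixed at 1); Figure 6.1 remains the authoritative layout, and the function names are illustrative.

```python
def pack_did(version: int, part_number: int, manufacturer: int) -> int:
    """Assemble a 32-bit DID value from its three fields."""
    return (version << 28) | (part_number << 12) | (manufacturer << 1) | 1


def unpack_did(did: int) -> dict:
    """Split a 32-bit DID value into its three fields."""
    return {
        "version": (did >> 28) & 0xF,
        "part_number": (did >> 12) & 0xFFFF,
        "manufacturer": (did >> 1) & 0x7FF,
    }
```

With version 8, part number 0x61A0, and manufacturer code 9, `pack_did` reproduces the 32-bit value given in the text, which suggests the assumed layout is consistent with it.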
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Figure 6.1. Format of DID Register 
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Figure 6.2. Logical Structure of BSR Register 



6.3 Instruction Register 

The Instruction Register (IR) selects the test to be 
performed and the test data register to be accessed. 
It is four bits wide, with no parity bit. Table 6.1 shows 
the encoding of the instructions supported by the 
TAP controller of the i860 XP microprocessor. The 
rightmost bit is the least significant and is the first 
shifted out on TDO. 



Table 6.1. TAP Instruction Encoding 

Instruction Code    Instruction 

0000                EXTEST Boundary Scan 
0001                SAMPLE Boundary Scan 
0010                IDCODE 
0011 ... 1110       Intel reserved CAUTION* 
1111                BYPASS 

* CAUTION: Operation of these private instructions may 
cause damage to the component. 



EXTEST The BSR cells associated with output pins 
drive the output pins of the i860 XP micro- 
processor. Values scanned into the BSR 
cells become the output values. The BSR 
cells associated with input pins sample 
the inputs of the i860 XP microprocessor. 
Note that I/O pins can be input or output 
for this test, depending on their control 
setting. The values shifted to the input 
latches are not used by the internal logic 
of the i860 XP microprocessor. After use 
of the EXTEST command, the i860 XP mi- 
croprocessor must be reset (with the RE- 
SET signal) before normal use. 

SAMPLE The BSR cells associated with output pins 
sample the value driven by the i860 XP 
microprocessor. BSR cells associated 
with input pins sample on the rising edge 
of TCK the values driven to the i860 XP 
microprocessor. BSR cells associated 
with I/O pins sample the value on the re- 
spective pin. The I/O pin can be driven by 
the i860 XP microprocessor or by external 
hardware. The values shifted to the input 
latches are not used by the internal logic 
of the i860 XP microprocessor. 

IDCODE The identification code of the i860 XP mi- 
croprocessor from the DID register is 
passed to TDO. The DID register is not 
altered by data shifted in on TDI. 

BYPASS Test data is passed from TDI to TDO via 
the single-bit BPR, effectively bypassing 
the test logic of the i860 XP microproces- 
sor. Because of its special encoding, this 
instruction can be entered by holding TDI 
HIGH while completing an instruction- 
scan cycle. This reduces the demands on 
the host test system in cases where ac- 
cess is required, for example, only to chip 
57 on a 100-chip board. 

Note that an open circuit fault in the 
board-level test data path causes the 
BPR register to be selected following an 
instruction-scan cycle, because the TDI 
input has a pull-up resistor. Therefore, no 
unwanted interference with the operation 
of the on-chip system logic can occur. 
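
The effect of holding TDI HIGH during an instruction scan can be illustrated with a small Python model of the four-bit instruction register. This is a sketch under assumptions: the function names are hypothetical, the register is modeled as starting from the fixed value 0001 loaded in the Capture-IR state (Section 6.4.11), and each TDI bit is taken to enter at the most significant end while the least significant bit leaves on TDO.

```python
IR_WIDTH = 4
CAPTURE_IR_VALUE = 0b0001  # fixed value loaded in Capture-IR

INSTRUCTIONS = {
    0b0000: "EXTEST",
    0b0001: "SAMPLE",
    0b0010: "IDCODE",
    0b1111: "BYPASS",
}


def shift_ir(tdi_bits):
    """Shift bits through the IR; return (new_ir, tdo_bits)."""
    ir = CAPTURE_IR_VALUE
    tdo = []
    for bit in tdi_bits:
        tdo.append(ir & 1)  # least significant bit leaves first
        ir = (ir >> 1) | (bit << (IR_WIDTH - 1))
    return ir, tdo


def decode(ir):
    """Name the instruction; codes 0011 to 1110 are Intel reserved."""
    return INSTRUCTIONS.get(ir, "Intel reserved")
```

Shifting four ones, as happens when TDI is held HIGH or is left open with its pull-up, leaves 1111 in the register, which decodes to BYPASS.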

Table 6.2 defines which registers are active during 
execution of each instruction. 



6.4 TAP Controller 

The TAP Controller is a synchronous, finite state 
machine. It controls the sequence of operations of 
the test logic. The TAP Controller changes state 
only in response to the following events: 

1. A rising edge of TCK. 

2. A transition to logic zero at the TRST# input. 

3. Power-up. 



The value of the TMS input signal at a rising edge of 
TCK controls the sequence of state changes. The 
state diagram for the TAP controller is shown in Fig- 
ure 6.3. Test designers must consider the operation 
of the state machine in order to design the correct 
sequence of values to drive on TMS. 
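
When planning a TMS sequence, it can help to model the controller as a lookup table. The Python sketch below transcribes the transitions described in Sections 6.4.1 through 6.4.16; the exits from Update-IR are not spelled out in the text and are assumed here to mirror Update-DR, as in the IEEE 1149.1 state diagram. The names `TAP_TRANSITIONS`, `step`, and `run` are illustrative.

```python
# state -> (next state if TMS = 0, next state if TMS = 1)
TAP_TRANSITIONS = {
    "Test-Logic-Reset": ("Run-Test/Idle", "Test-Logic-Reset"),
    "Run-Test/Idle": ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-DR-Scan": ("Capture-DR", "Select-IR-Scan"),
    "Capture-DR": ("Shift-DR", "Exit1-DR"),
    "Shift-DR": ("Shift-DR", "Exit1-DR"),
    "Exit1-DR": ("Pause-DR", "Update-DR"),
    "Pause-DR": ("Pause-DR", "Exit2-DR"),
    "Exit2-DR": ("Shift-DR", "Update-DR"),
    "Update-DR": ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-IR-Scan": ("Capture-IR", "Test-Logic-Reset"),
    "Capture-IR": ("Shift-IR", "Exit1-IR"),
    "Shift-IR": ("Shift-IR", "Exit1-IR"),
    "Exit1-IR": ("Pause-IR", "Update-IR"),
    "Pause-IR": ("Pause-IR", "Exit2-IR"),
    "Exit2-IR": ("Shift-IR", "Update-IR"),
    # Assumed to mirror Update-DR (not stated in the text).
    "Update-IR": ("Run-Test/Idle", "Select-DR-Scan"),
}


def step(state, tms):
    """Next state after one rising edge of TCK with the given TMS."""
    return TAP_TRANSITIONS[state][tms]


def run(state, tms_sequence):
    """Apply a sequence of TMS values, one per rising edge of TCK."""
    for tms in tms_sequence:
        state = step(state, tms)
    return state
```

For instance, `run("Test-Logic-Reset", [0, 1, 0, 0])` ends in Shift-DR, and five consecutive ones return the model to Test-Logic-Reset from any starting state.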

6.4.1 TEST-LOGIC-RESET STATE 

In this state, the test logic is disabled so that normal 
operation of the i860 XP microprocessor can contin- 
ue unhindered. This is achieved by initializing the in- 
struction register such that the IDCODE instruction 
is loaded. No matter what the original state of the 
controller, the controller enters Test-Logic-Reset 
when the TMS input is held HIGH for at least five 
rising edges of TCK. The controller remains in this 
state while TMS is HIGH. 

If the controller leaves the Test-Logic-Reset state as 
a result of an erroneous LOW signal on the TMS line 
at the time of a rising edge of TCK (for example, a 
glitch due to external interference), it returns to the 
Test-Logic-Reset state following three rising edges 
of TCK while the TMS signal is at the intended HIGH 
logic level. The operation of the test logic is such 
that no disturbance is caused to on-chip system 
logic operation as the result of such an error. On 
leaving the Test-Logic-Reset state, the controller 
moves into the Run-Test/Idle state, where no action 
occurs because the current instruction has been set 
to select operation of the DID register. The test 
logic is also inactive in the Select-DR-Scan and 
Select-IR-Scan states. 

The TAP controller is also forced to the Test-Logic- 
Reset state by applying a LOW logic level to the 
TRST# input and at power-up. 



Table 6.2. Registers Active by Instruction 

                              Register 
Mode      BSR                  DID           BPR 

EXTEST    TDI -> BSR -> TDO    Inactive      Inactive 
SAMPLE    TDI -> BSR -> TDO    Inactive      Inactive 
IDCODE    Inactive             DID -> TDO    Inactive 
BYPASS    Inactive             Inactive      TDI -> BPR -> TDO 
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NOTE: 

0,1 The values present on TMS at the time of a rising edge on TCK. 



Figure 6.3. TAP Controller State Diagram 



6.4.2 RUN-TEST/IDLE STATE 

The controller enters this state between scan opera- 
tions. Once in this state, the controller remains in 
this state as long as TMS is held LOW. No activity 
occurs in the test logic. The instruction register and 
all test data registers retain their previous state. 
When TMS is HIGH and a rising edge is applied to 
TCK, the controller moves to the Select-DR-Scan 
state. 



6.4.3 SELECT-DR-SCAN STATE 

This is a temporary controller state. The test data 
register selected by the current instruction retains its 
previous state. If TMS is held LOW and a rising edge 
is applied to TCK when in this state, the controller 
moves into the Capture-DR state, and a scan se- 
quence for the selected test data register is initiated. 
If TMS is held HIGH and a rising edge is applied to 
TCK, the controller moves to the Select-IR-Scan 
state. 



The instruction does not change in this state. 
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6.4.4 SELECT-IR-SCAN STATE 

This is a temporary controller state. The test data 
register selected by the current instruction retains its 
previous state. If TMS is held LOW and a rising edge 
is applied to TCK when in this state, the controller 
moves into the Capture-IR state, and a scan se- 
quence for the instruction register is initiated. If TMS 
is held HIGH and a rising edge is applied to TCK, the 
controller moves to the Test-Logic-Reset state. 

The instruction does not change in this state. 



6.4.5 CAPTURE-DR STATE 

In this state, the BSR captures input pin data if the 
current instruction is EXTEST or SAMPLE. The other 
test data registers, which do not have parallel input, 
are not changed. 

The instruction does not change in this state. 

When the TAP controller is in this state and a rising 
edge is applied to TCK, the controller enters the 
Exit1-DR state if TMS is HIGH or the Shift-DR state 
if TMS is LOW. 



6.4.6 SHIFT-DR STATE 

In this controller state, the test data register 
connected between TDI and TDO as a result of the 
current instruction shifts data one stage toward its 
serial output on each rising edge of TCK. 

The instruction does not change in this state. 

When the TAP controller is in this state and a rising 
edge is applied to TCK, the controller enters the 
Exit1-DR state if TMS is HIGH or remains in the 
Shift-DR state if TMS is LOW. 



6.4.7 EXIT1-DR STATE 

This is a temporary state. If TMS is held HIGH, a 
rising edge applied to TCK while in this state causes 
the controller to enter the Update-DR state, which 
terminates the scanning process. If TMS is held LOW 
and a rising edge is applied to TCK, the controller 
enters the Pause-DR state. 

The test data register selected by the current 
instruction retains its previous state unchanged. The 
instruction does not change in this state. 



6.4.8 PAUSE-DR STATE 

The pause state allows the test controller to 
temporarily halt the shifting of data through the test 
data register in the serial path between TDI and TDO. 
This might be necessary, for example, to allow the 
tester to reload its pin memory from disk during 
application of a long test sequence. 

The test data register selected by the current 
instruction retains its previous state. The instruction 
does not change in this state. 

The controller remains in this state as long as TMS 
is LOW. When TMS goes HIGH and a rising edge is 
applied to TCK, the controller moves to the Exit2-DR 
state. 



6.4.9 EXIT2-DR STATE 

This is a temporary state. If TMS is held HIGH and a 
rising edge is applied to TCK, the scanning process 
terminates, and the TAP controller enters the 
Update-DR state. If TMS is held LOW and a rising 
edge is applied to TCK, the controller enters the 
Shift-DR state. 

The test data register selected by the current 
instruction retains its previous state unchanged. The 
instruction does not change in this state. 

6.4.10 UPDATE-DR STATE 

The BSR register is provided with a latched parallel 
output to prevent changes at the parallel output 
while data is shifted in response to the EXTEST and 
SAMPLE instructions. When the TAP controller is in 
this state and the BSR register is selected, data is 
latched onto the parallel output of this register from 
the shift-register path on the falling edge of TCK. 
The data held at the latched parallel output does not 
change other than in this state. 

All shift-register stages in test data registers select- 
ed by the current instruction retain their previous 
state unchanged. The instruction does not change in 
this state. 

When the TAP controller is in this state and a rising 
edge is applied to TCK, the controller enters the 
Select-DR-Scan state if TMS is held HIGH or the 
Run-Test/Idle state if TMS is held LOW. 



6.4.11 CAPTURE-IR STATE 

In this controller state the shift register contained in 
the instruction register loads the fixed value 0001 on 
the rising edge of TCK. 
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The test data register selected by the current in- 
struction retains its previous state. The instruction 
does not change in this state. 

When the controller is in this state and a rising edge 
is applied to TCK, the controller enters the Exit1-IR 
state if TMS is held HIGH or the Shift-IR state if TMS 
is held LOW. 



6.4.12 SHIFT-IR STATE 

In this state, the shift register contained in the in- 
struction register is connected between TDI and 
TDO and shifts data one stage towards its serial out- 
put on each rising edge of TCK. 

The test data register selected by the current in- 
struction retains its previous state. The instruction 
does not change in this state. 

When the controller is in this state and a rising edge 
is applied to TCK, the controller enters the Exit1-IR 
state if TMS is held HIGH or remains in the Shift-IR 
state if TMS is held LOW. 



6.4.13 EXIT1-IR STATE 

This is a temporary state. If TMS is held HIGH, a 
rising edge applied to TCK while in this state causes 
the controller to enter the Update-IR state, which 
terminates the scanning process. If TMS is held LOW 
and a rising edge is applied to TCK, the controller 
enters the Pause-IR state. 

The test data register selected by the current in- 
struction retains its previous state unchanged. The 
instruction does not change in this state, and the 
instruction register retains its state. 



6.4.14 PAUSE-IR STATE 

This state allows the shifting of the instruction regis- 
ter to be temporarily halted. 

The test data register selected by the current in- 
struction retains its previous state. The instruction 
does not change in this state, and the instruction 
register retains its state. 

The controller remains in this state as long as TMS 
is LOW. When TMS goes HIGH and a rising edge is 
applied to TCK, the controller moves to the Exit2-IR 
state. 



6.4.15 EXIT2-IR STATE 

This is a temporary state. If TMS is held HIGH and a 
rising edge is applied to TCK, the scanning process 
terminates, and the TAP controller enters the 
Update-IR state. If TMS is held LOW and a rising 
edge is applied to TCK, the controller enters the 
Shift-IR state. 

The test data register selected by the current in- 
struction retains its previous state unchanged. The 
instruction does not change in this state, and the 
instruction register retains its state. 



6.4.16 UPDATE-IR STATE 

The instruction shifted into the instruction register is 
latched onto the parallel output from the shift-regis- 
ter path on the falling edge of TCK. Once the new 
instruction has been latched, it becomes the current 
instruction. 

Test data registers selected by the current instruc- 
tion retain the previous state. 



6.5 Boundary Scan Register Cell Ordering 

Figure 6.4 shows the order of cells in the BSR. 
There are 150 cells including TDO. TDI is not a BSR 
cell. 

The DCTL, ACTL, TCTL, and OCTL cells do not cor- 
respond to pins of the i860 XP microprocessor; rath- 
er, they control the bidirectional and tristate pins: 

DCTL D63-D0, DP7-DP0 

ACTL A31-A3 

TCTL Tristate outputs: ADS#, BE7#-BE0#, 
CACHE#, CTYP, D/C#, KB0, KB1, LEN, 
M/IO#, NENE#, PCD, PCYC, PWT, W/R# 

OCTL Outputs not floated in normal operation: 
BREQ, HIT#, HITM#, HLDA, LOCK#, 
PCHK# 

If a value of one is loaded into any of these control 
latches, the associated pins will not drive the exter- 
nal bus while running EXTEST. 
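
The relationship between control cells and pin groups can be captured in a small lookup table. The Python sketch below is illustrative only: the dictionary and function names are assumptions, and pin ranges are kept as group labels rather than expanded into individual pins.

```python
CONTROL_CELLS = {
    "DCTL": ["D63-D0", "DP7-DP0"],
    "ACTL": ["A31-A3"],
    "TCTL": ["ADS#", "BE7#-BE0#", "CACHE#", "CTYP", "D/C#", "KB0",
             "KB1", "LEN", "M/IO#", "NENE#", "PCD", "PCYC", "PWT",
             "W/R#"],
    "OCTL": ["BREQ", "HIT#", "HITM#", "HLDA", "LOCK#", "PCHK#"],
}


def pins_driven_in_extest(control_values):
    """List the pin groups that drive the bus during EXTEST.

    control_values maps each control cell name to 0 or 1; a value
    of one means the associated pins do not drive the external bus.
    """
    driven = []
    for cell, pins in CONTROL_CELLS.items():
        if control_values[cell] == 0:
            driven.extend(pins)
    return driven
```

Loading one into every control cell therefore leaves no pins driving the bus, which is a conservative starting point when other devices share the board traces.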

The values of DCTL, ACTL, TCTL, and OCTL are 
undefined during the SAMPLE instruction. 

The values and direction of I/O and outputs do not 
change during the scanning process (that is, during 
Shift-DR states). They only change after scanning is 
completed (in the Update-DR state). 

The decision table, Table 6.3, defines how the 
boundary scan instructions EXTEST and 
SAMPLE/PRELOAD utilize BSR. 
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Figure 6.4. Boundary Scan Register Ordering 
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6.6 TAP Controller Initialization 

TAP can be initialized by applying a high signal level 
on the TMS input for five periods of TCK or by acti- 
vating the TRST# input pin. TCK does not have to 
be running in order to initialize TAP with the TRST# 
pin. TRST# is provided with an internal pull-up resis- 
tor; so, even if an open circuit fault occurs, the TAP 
logic can still be used. 



Table 6.3. Instruction Functions 

Instruction:              EXTEST                      SAMPLE/PRELOAD 
Control Cell:             LOW          HIGH           LOW          HIGH 

Input BSR cells ...       ... sample values driven    ... sample values driven 
                          to processor by system      to processor by system 
Values of input cells     NO                          NO 
used by processor? 
Output BSR cells ...      ... drive output pins       ... sample values driven 
                          with cell values            by processor 
Input/output BSR cells:   Treat as     Treat as       Treat as     Treat as 
                          output       input          output       input 



7.0 MECHANICAL DATA 

Figures 7.1 and 7.2 show the locations of pins; 
Tables 7.1 and 7.2 help to locate pin identifiers. 



Figure 7.1. i860 XP Microprocessor Pin Configuration (Pin Side View) 

Figure 7.2. i860 XP Microprocessor Pin Configuration (Top Side View) 
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Table 7.1. Pin Cross Reference by Location 



Location  Signal      Location  Signal      Location  Signal      Location  Signal
A01       A17         C15       D12         G18       VCC         N01       VCC
A02       A23         C16       D8          G19       D29         N02       VCC
A03       A25         C17       D7          H01       VCC         N03       VSS
A04       BE5#        C18       D16         H02       VSS         N04       ADS#
A05       BE7#        C19       D18         H03       VSS         N05       HITM#
A06       BYPASS#     D01       A11         H04       A16         N15       DP7
A07       D0          D02       VSS         H05       A20         N16       D50
A08       VCC         D03       VCC         H15       D32         N17       VSS
A09       VCC         D04       A29         H16       PCHK#       N18       VCC
A10       VCC         D05       BE1#        H17       VSS         N19       D36
A11       VCC         D06       BE2#        H18       VSS         P01       VCCCLK
A12       VCC         D07       BE6#        H19       VCC         P02       VCC
A13       D2          D08       EWBE#       J01       VCC         P03       VSS
A14       VCC         D09       D1          J02       VCC         P04       RSRVD
A15       DP0         D10       D5          J03       VSS         P05       CTYP
A16       D3          D11       D10         J04       A12         P15       D59
A17       D4          D12       D14         J05       A14         P16       DP6
A18       D6          D13       DP2         J15       DP5         P17       VSS
A19       DP1         D14       D17         J16       D38         P18       VCC
B01       A15         D15       D19         J17       VSS         P19       D34
B02       A21         D16       D20         J18       VCC         Q01       TCK
B03       A24         D17       VCC         J19       VCC         Q02       VSS
B04       BE3#        D18       DP3         K01       VCC         Q03       VCC
B05       VSS         D19       D22         K02       VSS         Q04       CACHE#
B06       VCC         E01       A9          K03       VSS         Q05       AHOLD
B07       VCC         E02       VSS         K04       A10         Q15       D61
B08       VSS         E03       VCC         K05       A8          Q16       D54
B09       VCC         E04       A27         K15       D45         Q17       VCC
B10       VSS         E05       BE0#        K16       D43         Q18       VSS
B11       VCC         E15       D21         K17       VSS         Q19       DP4
B12       VSS         E16       D23         K18       VSS         R01       A4
B13       VCC         E17       D25         K19       VCC         R02       VSS
B14       VCC         E18       VSS         L01       VCC         R03       VCC
B15       VSS         E19       VCC         L02       VCC         R04       BOFF#
B16       D9          F01       A7          L03       VSS         R05       D/C#
B17       D11         F02       VCC         L04       SPARE       R06       PCD
B18       D13         F03       VSS         L05       A6          R07       INV
B19       D15         F04       A28         L15       D48         R08       PEN#
C01       A13         F05       A30         L16       D41         R09       BREQ
C02       A19         F15       D24         L17       VSS         R10       TDO
C03       A18         F16       D26         L18       VCC         R11       KB0
C04       A31         F17       VSS         L19       VCC         R12       HOLD
C05       BE4#        F18       VCC         M01       VCC         R13       TMS
C06       VSS         F19       D27         M02       VSS         R14       D63
C07       VSS         G01       VCC         M03       VSS         R15       D60
C08       VSS         G02       VCC         M04       CLK         R16       D57
C09       VSS         G03       VSS         M05       A5          R17       VCC
C10       VSS         G04       A22         M15       D53         R18       D33
C11       VSS         G05       A26         M16       D47         R19       D35
C12       VSS         G15       D28         M17       VSS         S01       A3
C13       VSS         G16       D30         M18       VSS         S02       RESET
C14       VSS         G17       VSS         M19       D31         S03       LOCK#






Table 7.1. Pin Cross Reference by Location (Continued) 



Location  Signal      Location  Signal      Location  Signal      Location  Signal
S04       M/IO#       S18       D52         T13       VSS         U08       VCC
S05       EADS#       S19       D37         T14       VSS         U09       VCC
S06       INT/CS8     T01       W/R#        T15       VSS         U10       VCC
S07       BERR        T02       LEN         T16       D56         U11       VCC
S08       FLINE#      T03       PWT         T17       D49         U12       VCC
S09       HLDA        T04       PCYC        T18       D42         U13       VCC
S10       KB1         T05       VSS         T19       D39         U14       VCC
S11       NENE#       T06       VSS         U01       BRDY#       U15       VCC
S12       HIT#        T07       VSS         U02       KEN#        U16       D55
S13       TRST#       T08       VSS         U03       NA#         U17       D51
S14       TDI         T09       VSS         U04       WB/WT#      U18       D44
S15       D62         T10       VSS         U05       VCC         U19       D40
S16       D58         T11       VSS         U06       VCC
S17       D46         T12       VSS         U07       VCC
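
The location codes in Table 7.1 follow a regular grid: rows are lettered A through U (the letters I and O do not appear in the table), columns are numbered 01 through 19, and rows E through Q carry pins only in columns 01-05 and 15-19. The following Python sketch assumes exactly that layout, as read off the table; the function name `pga_locations` is illustrative only. It enumerates the locations and confirms the 262-pin count.

```python
# Sketch: enumerate the pin locations implied by Table 7.1.
# Assumption: rows A-D and R-U are fully populated (columns 1-19);
# rows E-Q carry pins only in columns 1-5 and 15-19.
ROWS = "ABCDEFGHJKLMNPQRSTU"  # no I or O appears in the table


def pga_locations():
    """Return the list of location codes such as 'A01' or 'M15'."""
    locations = []
    for row in ROWS:
        full_row = row in "ABCDRSTU"
        for col in range(1, 20):
            if full_row or col <= 5 or col >= 15:
                locations.append(f"{row}{col:02d}")
    return locations


if __name__ == "__main__":
    locs = pga_locations()
    print(len(locs))          # number of pin locations
    print(locs[0], locs[-1])  # first and last location
```

A list like this can be used to check that a transcription of Table 7.1 or Table 7.2 has no missing or duplicated locations.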







Table 7.2. Pin Cross Reference by Pin Name

Signal    Location    Signal    Location    Signal    Location    Signal    Location
A3        S01         AHOLD     Q05         D13       B18         D43       K16
A4        R01         BE0#      E05         D14       D12         D44       U18
A5        M05         BE1#      D05         D15       B19         D45       K15
A6        L05         BE2#      D06         D16       C18         D46       S17
A7        F01         BE3#      B04         D17       D14         D47       M16
A8        K05         BE4#      C05         D18       C19         D48       L15
A9        E01         BE5#      A04         D19       D15         D49       T17
A10       K04         BE6#      D07         D20       D16         D50       N16
A11       D01         BE7#      A05         D21       E15         D51       U17
A12       J04         BERR      S07         D22       D19         D52       S18
A13       C01         BOFF#     R04         D23       E16         D53       M15
A14       J05         RSRVD     P04         D24       F15         D54       Q16
A15       B01         BRDY#     U01         D25       E17         D55       U16
A16       H04         BREQ      R09         D26       F16         D56       T16
A17       A01         CACHE#    Q04         D27       F19         D57       R16
A18       C03         CLK       M04         D28       G15         D58       S16
A19       C02         CTYP      P05         D29       G19         D59       P15
A20       H05         D0        A07         D30       G16         D60       R15
A21       B02         D1        D09         D31       M19         D61       Q15
A22       G04         D2        A13         D32       H15         D62       S15
A23       A02         D3        A16         D33       R18         D63       R14
A24       B03         D4        A17         D34       P19         D/C#      R05
A25       A03         D5        D10         D35       R19         DP0       A15
A26       G05         D6        A18         D36       N19         DP1       A19
A27       E04         D7        C17         D37       S19         DP2       D13
A28       F04         D8        C16         D38       J16         DP3       D18
A29       D04         D9        B16         D39       T19         DP4       Q19
A30       F05         D10       D11         D40       U19         DP5       J15
A31       C04         D11       B17         D41       L16         DP6       P16
ADS#      N04         D12       C15         D42       T18         DP7       N15
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Table 7.2. Pin Cross Reference by Pin Name (Continued) 




Signal    Location    Signal    Location    Signal    Location    Signal    Location
EADS#     S05         VCC       A08         VCC       N18         VSS       F03
FLINE#    S08         VCC       A09         VCC       P02         VSS       F17
HIT#      S12         VCC       A10         VCC       P18         VSS       G03
HITM#     N05         VCC       A11         VCC       Q03         VSS       G17
HLDA      S09         VCC       A12         VCC       Q17         VSS       H02
HOLD      R12         VCC       A14         VCC       R03         VSS       H03
INT/CS8   S06         VCC       B06         VCC       R17         VSS       H17
INV       R07         VCC       B07         VCC       U05         VSS       H18
KB0       R11         VCC       B09         VCC       U06         VSS       J03
KB1       S10         VCC       B11         VCC       U07         VSS       J17
KEN#      U02         VCC       B13         VCC       U08         VSS       K02
LEN       T02         VCC       B14         VCC       U09         VSS       K03
LOCK#     S03         VCC       D03         VCC       U10         VSS       K17
M/IO#     S04         VCC       D17         VCC       U11         VSS       K18
NA#       U03         VCC       E03         VCC       U12         VSS       L03
NENE#     S11         VCC       E19         VCC       U13         VSS       L17
PCD       R06         VCC       F02         VCC       U14         VSS       M02
PCHK#     H16         VCC       F18         VCC       U15         VSS       M03
PCYC      T04         VCC       G01         VCCCLK    P01         VSS       M17
PEN#      R08         VCC       G02         VSS       B05         VSS       M18
PWT       T03         VCC       G18         VSS       B08         VSS       N03
RESET     S02         VCC       H01         VSS       B10         VSS       N17
SPARE     L04         VCC       H19         VSS       B12         VSS       P03
EWBE#     D08         VCC       J01         VSS       B15         VSS       P17
BYPASS#   A06         VCC       J02         VSS       C06         VSS       Q02
TCK       Q01         VCC       J18         VSS       C07         VSS       Q18
TDI       S14         VCC       J19         VSS       C08         VSS       R02
TDO       R10         VCC       K01         VSS       C09         VSS       T05
TMS       R13         VCC       K19         VSS       C10         VSS       T06
TRST#     S13         VCC       L01         VSS       C11         VSS       T07
W/R#      T01         VCC       L02         VSS       C12         VSS       T08
WB/WT#    U04         VCC       L18         VSS       C13         VSS       T09
                      VCC       L19         VSS       C14         VSS       T10
                      VCC       M01         VSS       D02         VSS       T11
                      VCC       N01         VSS       E02         VSS       T12
                      VCC       N02         VSS       E18         VSS       T13
                                                                  VSS       T14
                                                                  VSS       T15
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Table 7.3. Ceramic PGA Package Dimension Symbols 


Letter or
Symbol      Description of Dimensions
A           Distance from seating plane to highest point of body
A1          Distance between seating plane and base plane (lid)
A2          Distance from base plane to highest point of body
A3          Distance from seating plane to bottom of body
B           Diameter of terminal lead pin
D           Largest overall package dimension of length
D1          A body length dimension, outer lead center to outer lead center
e1          Linear spacing between true lead position centerlines
L           Distance from seating plane to end of lead
S1          Other body dimension, outer lead center to edge of body

NOTES:
1. Controlling dimension: millimeter.
2. Dimension "e1" ("e") is noncumulative.
3. Seating plane (standoff) is defined by P.C. board hole size: 0.0415-0.0430 inch.
4. Dimensions "B", "B1", and "C" are nominal.
5. Details of Pin 1 identifier are optional.
[Figure 7.3 drawing: package outline. Recoverable labels: 45° chamfer (index corner), seating plane, base plane, ØB (all pins), swaged pin detail.]

Family: Ceramic Pin Grid Array Package

                  Millimeters                      Inches
Symbol      Min      Max      Notes          Min      Max      Notes
A           3.56     4.57                    .140     .180
A1          0.64     1.14     Solid Lid      .025     .045     Solid Lid
A2          2.79     3.56     Solid Lid      .110     .140     Solid Lid
A3          1.14     1.40                    .045     .055
B           0.43     0.51                    .017     .020
D           49.28    49.96                   1.940    1.967
D1          45.59    45.85                   1.795    1.805
e1          2.29     2.79                    .090     .110
L           2.54     3.30                    .100     .130
N           240      280                     240      280
S1          1.52     2.54                    .060     .100

ISSUE 9/90





Figure 7.3. 262-Lead Ceramic PGA Package Dimensions 
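
Because the millimeter is the controlling dimension for this package, the inch columns are derived values. The short Python sketch below (the helper name `mm_to_inch` and the choice of rows are illustrative assumptions) converts at 25.4 mm per inch and rounds to three decimal places, which appears to reproduce the inch entries for the rows shown.

```python
# Sketch: check inch columns of the package table against the
# millimeter columns (1 inch = 25.4 mm exactly).
MM_PER_INCH = 25.4

# (symbol, min mm, max mm, min inch, max inch) as printed in the table
DIMENSIONS = [
    ("A", 3.56, 4.57, 0.140, 0.180),
    ("D", 49.28, 49.96, 1.940, 1.967),
    ("D1", 45.59, 45.85, 1.795, 1.805),
    ("L", 2.54, 3.30, 0.100, 0.130),
]


def mm_to_inch(mm):
    """Convert millimeters to inches, rounded to three decimals."""
    return round(mm / MM_PER_INCH, 3)


if __name__ == "__main__":
    for symbol, mm_min, mm_max, in_min, in_max in DIMENSIONS:
        # Print converted values next to the printed inch values.
        print(symbol, mm_to_inch(mm_min), mm_to_inch(mm_max), in_min, in_max)
```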
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8.0 PACKAGE THERMAL 
SPECIFICATIONS 

For this section, let:

P   = maximum power consumption
TC  = case temperature
TA  = ambient air temperature
θCA = thermal resistance from case to ambient air
θJC = thermal resistance from junction to case
θJA = thermal resistance from junction to ambient air

The i860 XP microprocessor is specified for operation when TC is within
the range of 0°C-85°C. TC may be measured in any environment to determine
whether the i860 XP microprocessor is within specified operating range.
The case temperature should be measured at the center of the top surface
opposite the pins.

TA can be calculated from θCA with the following equation:

$$T_A = T_C - P \cdot \theta_{CA}$$

Typical values for θCA at various airflows and for θJC are given in
Table 8.1 for the 1.95 sq. in., 262 pin, ceramic PGA. θJC is shown so
that θJA can be calculated by:

$$\theta_{JA} = \theta_{JC} + \theta_{CA}$$

Note that θJC with a heatsink differs from θJC without a heatsink because
case temperature is measured differently. Case temperature for θJC with
heatsink is measured at the center of the heat fin base. Case temperature
for θJC without heatsink is measured at the center of the package top
surface.

Table 8.2 shows the maximum TA allowable (without exceeding TC) at
various airflows.

Note that TA is greatly improved by attaching "fins" or a "heat sink" to
the package. P (the maximum power consumption) is calculated by using the
maximum ICC at 5 V as tabulated in the D.C. Characteristics of section 9.

Figure 8.1 gives typical ICC derating with case temperature. For more
information on heat sinks, measurement techniques, or package
characteristics, refer to Intel Packaging Handbook, order number 240800.



Table 8.1. Thermal Resistance - In °C/Watt

                                θCA as a Function of Airflow - ft/min (m/sec)
                       θJC      0        200      400      600      800      1000
                                (0)      (1.01)   (2.03)   (3.04)   (4.06)   (5.07)
With Heat Sink*        1.6      10.1     6.3      4.3      3.2      2.5      2.2
Without Heat Sink      1.0      13.5     11.0     8.0      6.5      5.5      5.0

NOTE:
* Nine-fin, unidirectional heat sink (fin dimensions: 0.250" height, 0.040" fin width, 0.100" center-to-center spacing, 1.730" length)



Table 8.2. Maximum TA at Various Airflows - In °C

                                Airflow - ft/min (m/sec)
                       fCLK     0        200      400      600      800      1000
                       (MHz)    (0)      (1.01)   (2.03)   (3.04)   (4.06)   (5.07)
TA with Heat Sink*     50       24       47       59       66       70       72
TA without Heat Sink   50       4        19       37       46       52       55
TA with Heat Sink*     40       34.5     53.5     63.5     69       72.5     74
TA without Heat Sink   40       17.5     30       45       52.5     57.5     60

NOTE:
* Nine-fin, unidirectional heat sink (fin dimensions: 0.250" height, 0.040" fin width, 0.100" center-to-center spacing, 1.730" length)
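
The maximum-ambient figures above can be reproduced from the case-to-ambient resistances of Table 8.1. The Python sketch below assumes that P is the maximum ICC from the D.C. Characteristics (1.2 A at 50 MHz, 1.0 A at 40 MHz) multiplied by 5 V, and that TC sits at its 85°C limit. The function and constant names are illustrative, not part of the data sheet.

```python
# Sketch: maximum ambient temperature from T_A = T_C - P * theta_CA.
T_CASE_MAX = 85.0  # degrees C, upper limit of the specified case range
VCC = 5.0          # volts, supply assumed for the power calculation

# theta_CA in degrees C per watt, keyed by airflow in ft/min
THETA_CA_WITH_SINK = {0: 10.1, 200: 6.3, 400: 4.3, 600: 3.2, 800: 2.5, 1000: 2.2}
THETA_CA_NO_SINK = {0: 13.5, 200: 11.0, 400: 8.0, 600: 6.5, 800: 5.5, 1000: 5.0}


def max_ambient(icc_max_amps, theta_ca):
    """Return the highest T_A (deg C) that keeps T_C at or below 85 deg C."""
    power = icc_max_amps * VCC  # maximum power consumption in watts
    return T_CASE_MAX - power * theta_ca


if __name__ == "__main__":
    for airflow, theta in THETA_CA_WITH_SINK.items():
        # 1.2 A is the 50 MHz maximum supply current
        print(airflow, round(max_ambient(1.2, theta), 1))
```

With these assumptions the 40 MHz results match the table exactly, and the 50 MHz results agree with it after rounding to the nearest degree.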
[Figure 8.1 plot: curves labeled 50 MHz and 40 MHz; vertical-axis ticks from 0.80 to 1.30; horizontal axis: TEMPERATURE (Degrees Centigrade).]

Figure 8.1. ICC Derating with Case Temperature

9.0 ELECTRICAL DATA 

All input and output timings are specified relative to
the 1.5V level of the rising edge of CLK and refer to
the point that the signal reaches 1.5V.
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9.1 Absolute Maximum Ratings 

Case Temperature TC under Bias               0°C to 85°C

Storage Temperature                          -65°C to +150°C

Voltage on Any Pin with Respect to Ground    -0.5 to VCC + 0.5V



NOTICE: This data sheet contains preliminary infor- 
mation on new products in production. The specifica- 
tions are subject to change without notice. Verify with 
your local Intel Sales office that you have the latest 
data sheet before finalizing a design. 



* WARNING: Stressing the device beyond the "Absolute 
Maximum Ratings" may cause permanent damage. 
These are stress ratings only. Operation beyond the 
"Operating Conditions" is not recommended and ex- 
tended exposure beyond the "Operating Conditions" 
may affect device reliability. 



9.2 D.C. Characteristics 

Table 9.1. D.C. Characteristics
Operating Conditions: VCC = 5V ±5%; TC = 0°C to 85°C

Symbol   Parameter                          Min      Max          Units   Notes
VIL      Input LOW voltage (TTL)            -0.3     +0.8         V
VIH      Input HIGH voltage (TTL)           2.0      VCC + 0.3    V
VIHC     CLK Input HIGH (TTL)               2.5      VCC + 0.3    V
VOL      Output LOW voltage (TTL)                    0.45         V       1
VOH      Output HIGH voltage (TTL)          2.4                   V       2
ICC      Power supply current (@ 50 MHz)             1.2          Amp     3
ICC      Power supply current (@ 40 MHz)             1.0          Amp     3
ILI      Input leakage current                       ±15          µA      4
ILIP     Input leakage current (pull-up)             -400         µA      5
ILO      Output leakage current                      ±15          µA      6
CIN      Input capacitance                           11.5         pF      7
CO       I/O or output capacitance                   14           pF      7

NOTES:

1. This parameter is measured with current load of 5 mA.

2. This parameter is measured with current load of 1 mA. Typical value is VCC - 0.45V.

3. Measured at 50 MHz and VCC = 5V.

4. This parameter is for inputs without pullups. VCC is on, and 0V ≤ VIN ≤ VCC.

5. This parameter is for inputs with pullups and VIL = 0.45V. Note that if the pull-ups are put in high-impedance state via the DCTL boundary scan cell that also tri-states the data outputs, then the leakage is ±15 µA.

6. 0.45V ≤ VIN ≤ VCC - 0.45V.

7. These parameters are not tested; they are guaranteed by design characterization.
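
One quantity that follows directly from the D.C. limits is the worst-case TTL noise margin when an i860 XP output drives an input with the same thresholds. The Python sketch below uses the conventional definitions NMH = VOH(min) - VIH(min) and NML = VIL(max) - VOL(max). These definitions are standard practice rather than something stated in this data sheet, and the constant names are illustrative. The CLK input, with its higher VIHC threshold, is not covered.

```python
# Sketch: worst-case noise margins from the D.C. limits.
V_IL_MAX = 0.8   # input LOW voltage, maximum (V)
V_IH_MIN = 2.0   # input HIGH voltage, minimum (V)
V_OL_MAX = 0.45  # output LOW voltage, maximum at 5 mA load (V)
V_OH_MIN = 2.4   # output HIGH voltage, minimum at 1 mA load (V)


def noise_margins():
    """Return (high-level margin, low-level margin) in volts."""
    nm_high = V_OH_MIN - V_IH_MIN
    nm_low = V_IL_MAX - V_OL_MAX
    return nm_high, nm_low


if __name__ == "__main__":
    nm_high, nm_low = noise_margins()
    print(f"NMH = {nm_high:.2f} V, NML = {nm_low:.2f} V")
```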
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9.3 A.C. Characteristics 





Table 9.2. A.C. Characteristics
CL = 0 pF Unless Otherwise Specified; VCC = 5V ±5%; TC = 0°C to 85°C

                                                          40 MHz          50 MHz
Symbol   Parameter                               Fig    Min     Max     Min     Max     Notes
                                                        (ns)    (ns)    (ns)    (ns)
tc       CLK Period                              9.1    25      40      20      40
ttc      TCK Period                              9.2    40      1000    40      1000
         CLK Stability                           9.1            0.1%            0.1%
tch      CLK High Time                           9.1    7               7
tcl      CLK Low Time                            9.1    7               7
tr       CLK Rise Time                           9.1            3               3       h
tf       CLK Fall Time                           9.1            3               3       h
ts       TCK to CLK Skew                         9.3            ±1              ±1      i
ttch     TCK High Time                           9.2    10              10
ttcl     TCK Low Time                            9.2    10              10
ttcr     TCK Rise Time                           9.2            4               4
ttcf     TCK Fall Time                           9.2            4               4
tsu.1    RESET, HOLD, BERR, FLINE#,              9.1    8               7
         PEN#, INT/CS8 Setup Time
tsu.2    BOFF#, AHOLD, KEN#, NA#,                9.1    8               7
         INV, WB/WT# Setup Time
tsu.3    EADS# Setup Time                        9.1    9               8
tsu.4    EWBE# Setup Time                        9.1    8.5             7.5
tsu.5    BRDY# Setup Time                        9.1    8.5             7.5
tsu.6    D63-D0, DP7-DP0 Setup Time              9.1    8.5             7.5
tsu.7    D63-D0, DP7-DP0 Setup Time              9.1    5.5             4.5
         (Late Backoff Mode)
tsu.8    A31-A5 Setup Time                       9.1    11              10
ttsu     TDI, TMS, TRST# Setup Time              9.2    8               8
tth      TDI, TMS, TRST# Hold Time               9.2    2               1               b
th.1     Hold Time, All Inputs                   9.1    2               1               c
         except D63-D0, DP7-DP0
th.2     D63-D0, DP7-DP0 Hold Time               9.1    3               2               c
         (Normal and Late Back-Off Mode)
ttco     TDO Valid Delay and All Outputs         9.2    1.5     17.5    1.5     16.5    a,f
         Valid Delay in EXTEST Mode
tco.1    A31-A22 Valid Delay                     9.1    1.5     12      1.5     11      a
tco.2a   A21-A3 Valid Delay                      9.1    1.5     11.5    1.5     10.5    a,g
         (High Current Mode)
tco.2b   A21-A3 Valid Delay                      9.1    1.5     12      1.5     11      a
         (Normal Current Mode)
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Table 9.2. A.C. Characteristics (Continued) 
CL = 0 pF Unless Otherwise Specified; VCC = 5V ±5%; TC = 0°C to 85°C

                                                          40 MHz          50 MHz
Symbol   Parameter                               Fig    Min     Max     Min     Max     Notes
                                                        (ns)    (ns)    (ns)    (ns)
tco.3    D63-D0, DP7-DP0 Valid Delay             9.1    2.5     14      2.5     13      a,d
tco.4    BREQ, HLDA, PCHK#,                      9.1    1.5     13      1.5     12      a
         NENE#, KB0, KB1 Valid Delay
tco.5a   ADS# Valid Delay                        9.1    1.5     10      1.5     9       a,g
         (High Current Mode)
tco.5b   ADS# Valid Delay                        9.1    1.5     11      1.5     10      a
         (Normal Current Mode)
tco.6a   W/R# Valid Delay                        9.1    1.5     11      1.5     10      a,g
         (High Current Mode)
tco.6b   W/R# Valid Delay                        9.1    1.5     12      1.5     11      a
         (Normal Current Mode)
tco.7a   HITM# Valid Delay                       9.1    1.5     12      1.5     11      a,g
         (High Current Mode)
tco.7b   HITM# Valid Delay                       9.1    1.5     13      1.5     12      a
         (Normal Current Mode)
tco.8    PWT, PCD, HIT#, CTYP, D/C#, M/IO#,      9.1    1.5     12      1.5     11      a
         PCYC, LOCK#, CACHE#, LEN Valid Delay
tco.9a   BE0#-BE7# Valid Delay                   9.1    1.5     12      1.5     11      a,g
         (High Current Mode)
tco.9b   BE0#-BE7# Valid Delay                   9.1    1.5     13      1.5     12      a
         (Normal Current Mode)
tz.1     Float Time All Outputs                  9.1    2       19      2       18      e
         except D63-D0, DP7-DP0
tz.2     Float Time D63-D0, DP7-DP0              9.1    3       19      3       18      e
tzt      Float Time during Boundary Scan EXTEST  9.1            20              20      f

NOTES:

a. Minimum and maximum delays are for 0 pF load.

b. These hold times are referenced to the falling edge of TCK.

c. These hold times are referenced to the rising edge of CLK.

d. Output delay for D63-D0, DP7-DP0 is from the CLK after ADS# activation.

e. Float time = delay until maximum output current is less than ±ILO. Float time is not tested.

f. Delay from falling edge of TCK.

g. These pins can be configured as normal or high-current buffers. When they are configured as high-current buffers for interface with cache memory or other large loads, use the derating curves in Figure 9.3. Otherwise, all normal buffers use the derating curves in Figure 9.4.

h. tr and tf should be measured between 0.8V and 2.5V.

i. Assumes TCK and CLK both at 25 MHz.
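
The valid-delay maxima above can be turned into a simple budget for an external device that samples a processor output on the next rising CLK edge. The Python sketch below is only an illustration under stated assumptions: the load-dependent derating of Figures 9.3 and 9.4, board flight time, and clock skew are lumped into one `extra_delay_ns` argument, and both function names are hypothetical.

```python
# Sketch: time left for an external device's setup requirement.
def clk_period_ns(freq_mhz):
    """CLK period in nanoseconds for a frequency in MHz."""
    return 1000.0 / freq_mhz


def setup_window_ns(freq_mhz, valid_delay_max_ns, extra_delay_ns=0.0):
    """Time between an output becoming valid and the next rising CLK edge.

    valid_delay_max_ns is a valid-delay maximum (specified at 0 pF load);
    extra_delay_ns is an assumed allowance for loading, trace delay and skew.
    """
    return clk_period_ns(freq_mhz) - valid_delay_max_ns - extra_delay_ns


if __name__ == "__main__":
    # A21-A3 in normal current mode: maximum valid delay of 11 ns at 50 MHz
    print(setup_window_ns(50, 11.0))       # window with no extra delay
    print(setup_window_ns(50, 11.0, 3.0))  # with an assumed 3 ns allowance
```

The external device's own setup time must fit inside the returned window. A negative result means the path does not close in one clock at that frequency.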
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Figure 9.1. CLK, Input, and Output Timings 
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Figure 9.2. TAP Signal Timings 
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ADS#, A21-A3, BE7#-BE0#, W/R#, HITM# (In High-Current Mode) 
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NOTES: 

Graphs are not linear outside the CL range shown.
NOMINAL = 0 pF value given in the A.C. Timings table.
* Typical part under worst-case conditions. 



Figure 9.3. Typical Output Delay vs Load Capacitance 
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NOTES: 
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Figure 9.4. Typical Output Delay vs Load Capacitance 
[Plot for Figure 9.5a; horizontal axis: LOAD CAPACITANCE, CL (pF)]
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NOTES: 

Graphs are not linear outside the CL range shown.
Typical part under worst-case conditions. 



Figure 9.5a. Typical Slew Time vs Load Capacitance under Worst-Case Conditions (Rising Voltage) 
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NOTES: 

Graphs are not linear outside the CL range shown.
Typical part under worst-case conditions. 



Figure 9.5b. Typical Slew Time vs Load Capacitance under Worst-Case Conditions (Falling Voltage) 
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NOTES: 

Graph is not linear outside the frequency range shown. 
*Worst-case supply current at 5V. 
Figure 9.6. Typical ICC vs. Frequency
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9.4 Component Buffer Model 

9.4.1 FIRST ORDER ELECTRICAL BUFFER 
MODEL 

The first order electrical buffer model provides an accurate and simple
representation of the buffers used in the inputs and outputs of the
CHMOS i860 XP CPU. The model output consists of four components:

1. Linear voltage waveform (dV/dt)

2. Intrinsic buffer delay due to CL (t0)

3. Buffer output impedance (Ro)

4. Buffer output capacitance (Co)

as shown in Figure 9.7a.

A fitting algorithm has been used to arrive at values for dV/dt, t0, Co,
and Ro such that Ro matches the actual buffer impedance and Co, the
intrinsic buffer output capacitance whether the output is on or off,
remains constant across the operating range, while minimizing the
difference between the full buffer circuit and its simplified electrical
model for a set of different loads (lumped capacitance, and short and
long transmission lines). dV/dt is the slope of the voltage ramp, while
t0 is the intrinsic buffer delay associated with a given CL. t0 accounts
for the intrinsic delay by offsetting the excitation of the model by the
amount of the delay.

NOTE:

t0 is zero for CL = 0 and when the load is represented by a transmission
line.
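
For a purely capacitive load, the four model components combine into a closed-form waveform. The Python sketch below is a minimal illustration, not the fitting procedure itself. It assumes that the source is a linear ramp delayed by the intrinsic delay, that Ro is in series with the source, that Co is lumped with CL at the output node, and that a dV/dT entry such as 5.0/1.4 means a 5.0 V swing completed in 1.4 ns. The function names and the example load are hypothetical.

```python
import math


def _ramp_response(t_ns, slope_v_per_ns, tau_ns):
    """RC low-pass response to an unbounded ramp that starts at t = 0."""
    if t_ns <= 0.0:
        return 0.0
    return slope_v_per_ns * (t_ns - tau_ns * (1.0 - math.exp(-t_ns / tau_ns)))


def output_voltage(t_ns, swing_v, ramp_ns, t0_ns, ro_ohms, co_pf, cl_pf):
    """Voltage at the load for a low-to-high transition.

    Assumptions: the source ramps linearly by swing_v over ramp_ns, starting
    after the intrinsic delay t0_ns, then holds; Ro is in series and Co is
    lumped with CL at the output node.
    """
    tau_ns = ro_ohms * (co_pf + cl_pf) * 1e-3  # ohm * pF = 1e-3 ns
    slope = swing_v / ramp_ns
    t = t_ns - t0_ns
    # A ramp that stops at swing_v is the difference of two shifted ramps.
    return (_ramp_response(t, slope, tau_ns)
            - _ramp_response(t - ramp_ns, slope, tau_ns))


if __name__ == "__main__":
    # Example values for a small buffer at VCC = 5.0 V, Tj = 80 C:
    # Ro = 39.2 ohms, Co = 4.3 pF, 5.0 V in 1.4 ns, and an intrinsic
    # delay taken as 0.4 ns for an assumed 50 pF lumped load.
    for t in (0.0, 1.0, 2.0, 5.0, 10.0):
        v = output_voltage(t, 5.0, 1.4, 0.4, 39.2, 4.3, 50.0)
        print(f"t = {t:4.1f} ns  v = {v:.2f} V")
```

When the load is a transmission line rather than a lumped capacitance, this closed form does not apply and the intrinsic delay is zero, as the note above states.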





[Figure 9.7a schematic. Recoverable labels: Ro and the source term dV/dt · u(t - t0).]
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Figure 9.7a. Output Model 

The input model consists of one component, buffer 
capacitance (Cin), as shown in Figure 9.7b. 
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Figure 9.7b. Input Model 

9.4.2 FIRST ORDER ELECTRICAL MODEL 
PARAMETER VALUES 

The parameters that make up the first order electrical model vary with
the buffer design. In addition, these parameters also vary with the
operating condition (i.e., temperature and VCC) of the buffer process.
The typical process corner is being modeled. Two sizes of buffer are used
on these components, labelled here as small and large. Tables 9.3 and 9.4
list the parameter values dV/dt, t0, Ro, and Co. These parameters are
provided for both low-to-high and high-to-low transitions at the typical
process corner for three operating conditions (VCC = 5.5V and
Tj = -10°C, VCC = 5.0V and Tj = 80°C, and VCC = 4.5V and Tj = 125°C).

9.4.3 PACKAGE PARAMETERS 

In addition to the buffer characteristics, package characteristics are
also included to complete the model. Package inductance, capacitance and
resistance values vary with design geometry and material properties of
the package. Figure 9.8 shows a model of the package including these
parameters. As shown in Figure 9.9, the package model should be placed
between the first order electrical buffer model and the board
interconnects. Notice the package model only includes the package
inductance (Lp) and capacitance (Cp). This is sufficient since the
package resistance is so small it is negligible.

Table 9.5 lists the buffer model parameters for each pin of the i860 XP
microprocessor. The table gives the package model parameters for each
pin, followed by the input capacitance (input and I/O pins) and/or output
buffer size (outputs and I/O). In those cases where the buffer used by a
pin is an option selected at reset by the PEN# input, the output buffer
column lists the sizes available. Large buffers correspond to
high-current mode, while small buffers correspond to normal current mode.
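
Because the package is represented by Lp and Cp alone, two derived quantities give a quick feel for its effect on a pin: the resonant frequency 1/(2π·sqrt(Lp·Cp)) and the characteristic impedance sqrt(Lp/Cp). These are textbook expressions for a lossless LC section, not figures given in this data sheet, and the function names below are illustrative. The example uses the typical values listed for pin A3 (Cp = 7.6 pF, Lp = 13.8 nH).

```python
import math


def package_resonance_mhz(lp_nh, cp_pf):
    """Resonant frequency of the Lp-Cp section, in MHz."""
    l_henry = lp_nh * 1e-9
    c_farad = cp_pf * 1e-12
    return 1.0 / (2.0 * math.pi * math.sqrt(l_henry * c_farad)) / 1e6


def package_impedance_ohms(lp_nh, cp_pf):
    """Characteristic impedance sqrt(L/C) of the Lp-Cp section, in ohms."""
    return math.sqrt((lp_nh * 1e-9) / (cp_pf * 1e-12))


if __name__ == "__main__":
    # Typical package values listed for pin A3
    print(round(package_resonance_mhz(13.8, 7.6), 1), "MHz")
    print(round(package_impedance_ohms(13.8, 7.6), 1), "ohms")
```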
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Figure 9.8. Package Model 



9.4.4 BOARD INTERCONNECTS 

The board interconnect can be considered as a 
lumped parameter (capacitive load) or as a transmis- 
sion line. As a rule of thumb, an unterminated board 
interconnect may be considered as a capacitive load 
if the round trip time (time for signal to travel from 
one end of the interconnect to the other and back) is 
short compared to the transition time of the signal. 
At frequencies of 50 MHz and above most intercon- 
nects behave as transmission lines (Figure 9.10). 
For accurate results at high frequencies, these 
transmission line effects must be taken into account 
and modeled. 
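
The rule of thumb above can be written as a small check. In the Python sketch below, the propagation delay per inch (0.18 ns/in) and the threshold ratio (0.25) are assumed illustrative values, not data sheet numbers; "short compared to the transition time" is a judgment the designer has to quantify. The function names are hypothetical.

```python
# Sketch: lumped-load versus transmission-line decision for one trace.
def round_trip_ns(length_in, delay_ns_per_in):
    """Time for a signal to travel down the trace and back."""
    return 2.0 * length_in * delay_ns_per_in


def is_lumped_load(length_in, transition_ns, delay_ns_per_in=0.18, ratio=0.25):
    """True when the round trip is short compared to the transition time.

    delay_ns_per_in and ratio are assumptions chosen for illustration.
    """
    return round_trip_ns(length_in, delay_ns_per_in) <= ratio * transition_ns


if __name__ == "__main__":
    # Assumed 1.4 ns output transition time
    print(is_lumped_load(0.5, 1.4))  # short stub
    print(is_lumped_load(6.0, 1.4))  # long trace
```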
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Figure 9.9a. Output Buffer and Package Model 
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Figure 9.9b. Input Buffer and Package Model 



Figure 9.10. Transmission Line Model 
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Table 9.3. Small Output Buffer First Order Electrical Model Parameter Values 


                                                        t0 (ns) at various CL
                     Tj     Ro       Co                0      5      25     50     100    150
Transition    VCC    (°C)   (ohms)   (pF)   dV/dT      (pF)   (pF)   (pF)   (pF)   (pF)   (pF)
Low-to-High   5.5    -10    28.0     4.3    5.5/1.2    0      0.0    0.1    0.3    0.7    1.1
Low-to-High   5.5    80     36.4     4.3    5.5/1.4    0      0.0    0.1    0.8    0.8    1.2
Low-to-High   5.5    125    40.4     4.3    5.5/1.5    0      0.0    0.1    0.4    0.8    1.2
Low-to-High   5.0    -10    30.2     4.3    5.0/1.2    0      0.0    0.1    0.4    0.8    1.2
Low-to-High   5.0    80     39.2     4.3    5.0/1.4    0      0.0    0.2    0.4    0.9    1.3
Low-to-High   5.0    125    43.5     4.3    5.0/1.6    0      0.0    0.2    0.4    0.9    1.3
Low-to-High   4.5    -10    33.0     4.3    4.5/1.2    0      0.0    0.2    0.5    1.0    1.4
Low-to-High   4.5    80     42.8     4.3    4.5/1.6    0      0.0    0.2    0.5    1.0    1.5
Low-to-High   4.5    125    47.4     4.3    4.5/1.6    0      0.0    0.3    0.6    1.1    1.6
High-to-Low   5.5    -10    23.2     4.3    5.5/1.0    0      0.0    0.4    0.7    1.2    1.6
High-to-Low   5.5    80     31.4     4.3    5.5/1.4    0      0.0    0.4    0.9    1.3    1.8
High-to-Low   5.5    125    36.1     4.3    5.5/1.6    0      0.0    0.5    0.8    1.3    1.8
High-to-Low   5.0    -10    24.0     4.3    5.0/1.1    0      0.0    0.5    0.9    1.2    1.7
High-to-Low   5.0    80     32.8     4.3    5.0/1.4    0      0.0    0.5    0.9    1.5    1.9
High-to-Low   5.0    125    37.8     4.3    5.0/1.7    0      0.0    0.5    0.9    1.4    1.8
High-to-Low   4.5    -10    25.1     4.3    4.5/1.2    0      0.0    0.4    0.7    1.2    1.7
High-to-Low   4.5    80     34.5     4.3    4.5/1.6    0      0.0    0.4    0.8    1.3    1.8
High-to-Low   4.5    125    39.9     4.3    4.5/1.8    0      0.0    0.5    0.9    1.4    1.9
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Table 9.4. Large Output Buffer First Order Electrical Model Parameter Values 


                                                        t0 (ns) at various CL
                     Tj     Ro       Co                0      5      25     50     100    150    200    250    300
Transition    VCC    (°C)   (ohms)   (pF)   dV/dT      (pF)   (pF)   (pF)   (pF)   (pF)   (pF)   (pF)   (pF)   (pF)
Low-to-High   5.5    -10    12.1     4.3    5.5/0.7    0      0.0    0.1    0.3    0.6    0.8    1.0    1.3    1.5
Low-to-High   5.5    80     15.5     4.3    5.5/0.9    0      0.0    0.2    0.3    0.6    0.9    1.1    1.4    1.7
Low-to-High   5.5    125    17.2     4.3    5.5/1.1    0      0.0    0.2    0.4    0.7    1.0    1.2    1.4    1.7
Low-to-High   5.0    -10    13.0     4.3    5.0/0.9    0      0.0    0.1    0.3    0.6    0.9    1.1    1.4    1.7
Low-to-High   5.0    80     16.7     4.3    5.0/1.0    0      0.0    0.2    0.4    0.8    1.1    1.4    1.7    2.0
Low-to-High   5.0    125    18.5     4.3    5.0/1.2    0      0.0    0.2    0.4    0.8    1.1    1.4    1.7    2.0
Low-to-High   4.5    -10    14.1     4.3    4.5/0.9    0      0.0    0.2    0.4    0.7    1.1    1.4    1.7    2.0
Low-to-High   4.5    80     18.0     4.3    4.5/1.2    0      0.0    0.2    0.4    0.9    1.2    1.5    1.9    2.2
Low-to-High   4.5    125    19.9     4.3    4.5/1.3    0      0.0    0.2    0.5    0.8    1.2    1.5    1.9    2.2
High-to-Low   5.5    -10    10.6     4.3    5.5/0.7    0      0.0    0.3    0.6    0.9    1.2    1.5    1.8    2.0
High-to-Low   5.5    80     13.9     4.3    5.5/1.0    0      0.0    0.4    0.7    1.2    1.5    1.9    2.2    2.5
High-to-Low   5.5    125    15.8     4.3    5.5/1.1    0      0.0    0.4    0.8    1.3    1.7    2.0    2.4    2.8
High-to-Low   5.0    -10    11.0     4.3    5.0/0.8    0      0.0    0.4    0.7    1.0    1.3    1.6    1.9    2.1
High-to-Low   5.0    80     14.5     4.3    5.0/1.0    0      0.0    0.4    0.8    1.2    1.6    2.0    2.3    2.6
High-to-Low   5.0    125    16.5     4.3    5.0/1.2    0      0.0    0.4    0.8    1.3    1.7    2.1    2.5    2.8
High-to-Low   4.5    -10    11.3     4.3    4.5/0.9    0      0.0    0.4    0.7    1.1    1.4    1.7    2.0    2.4
High-to-Low   4.5    80     15.2     4.3    4.5/1.2    0      0.0    0.4    0.8    1.3    1.6    2.0    2.3    2.7
High-to-Low   4.5    125    17.4     4.3    4.5/1.3    0      0.0    0.4    0.8    1.3    1.7    2.1    2.5    2.8
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Table 9.5 Buffer Models 






                                              Input Buffer   Output Buffer
                        Cp (pF)    Lp (nH)    Cin (pF)       Size
Pin Name    Location    Typical    Typical    Typical        (Large or Small)
A3          S01         7.6        13.8       6.7            L/S
A4          R01         6.2        14.5       6.7            L/S
A5          M05         6.5        7.8        6.7            L/S
A6          L05         5.3        8.0        6.7            L/S
A7          F01         7.7        16.2       6.7            L/S
A8          K05         5.1        7.7        6.7            L/S
A9          E01         8.0        16.4       6.7            L/S
A10         K04         5.1        8.8        6.7            L/S
A11         D01         8.3        16.8       6.7            L/S
A12         J04         5.2        9.0        6.7            L/S
A13         C01         8.7        17.2       6.7            L/S
A14         J05         5.2        7.8        6.7            L/S
A15         B01         9.0        17.8       6.7            L/S
A16         H04         5.2        9.0        6.7            L/S
A17         A01         9.4        18.2       6.7            L/S
A18         C03         7.8        14.5       6.7            L/S
A19         C02         9.0        15.3       6.7            L/S
A20         H05         7.5        7.7        6.7            L/S
A21         B02         8.5        15.7       6.7            L/S
A22         G04         7.5        9.1        4.4            S
A23         A02         8.1        15.7       4.4            S
A24         B03         7.0        14.5       4.4            S
A25         A03         7.7        14.6       4.4            S
A26         G05         6.7        7.9        4.4            S
A27         E04         7.6        9.6        4.4            S
A28         F04         6.5        9.2        4.4            S
A29         D04         7.4        10.0       4.4            S
A30         F05         5.9        8.2        4.4            S
A31         C04         6.6        10.4       4.4            S
ADS#        N04         6.2        9.1                       L/S
AHOLD       Q05         6.0        8.8        2.0
BE0#        E05         5.7        8.8                       L/S
BE1#        D05         6.7        8.8                       L/S
BE2#        D06         5.7        9.0                       L/S
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Table 9.5. Buffer Models (Continued) 



                                              Input Buffer   Output Buffer
                        Cp (pF)    Lp (nH)    Cin (pF)       Size
Pin Name    Location    Typical    Typical    Typical        (Large or Small)
BE3#        B04         6.5        11.2                      L/S
BE4#        C05         5.9        10.6                      L/S
BE5#        A04         6.5        12.0                      L/S
BE6#        D07         4.9        8.6                       L/S
BE7#        A05         6.1        11.5                      L/S
BERR        S07         5.8        8.7        2.0
BOFF#       R04         6.3        10.4       2.0
RSRVD       P04         6.4        9.4        2.0
BRDY#       U01         8.0        14.7       2.0
BREQ        R09         4.4        7.5                       S
BYPASS#     A06         Strapping Option
CACHE#      Q04         6.6        9.8                       S
CLK         M04         6.2        8.9        2.0
CTYP        P05         6.5        8.6                       S
D0          A07         5.5        10.6       4.4            S
D1          D09         7.6        7.6        4.4            S
D2          A13         7.4        15.0       4.4            S
D3          A16         7.7        17.7       4.4            S
D4          A17         9.2        17.9       4.4            S
D5          D10         7.5        7.6        4.4            S
D6          A18         9.4        18.3       4.4            S
D7          C17         8.6        15.9       4.4            S
D8          C16         8.6        14.5       4.4            S
D9          B16         9.3        14.7       4.4            S
D10         D11         8.3        7.5        4.4            S
D11         B17         8.9        14.7       4.4            S
D12         C15         8.1        7.8        4.4            S
D13         B18         8.6        15.4       4.4            S
D14         D12         7.2        7.8        4.4            S
D15         B19         8.2        15.6       4.4            S
D16         C18         7.9        10.7       4.4            S
D17         D14         6.7        9.2        4.4            S
D18         C19         7.6        14.2       4.4            S
D19         D15         6.4        10.0       4.4            S
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Table 9.5 Buffer Models (Continued) 



Pin Name   Location   Cp (pF)   Lp (nH)   Input Buffer   Output Buffer Size 
                      Typical   Typical   Cin (pF)       (Large or Small) 
                                          Typical 

D20        D16        7.4       10.7      4.4            S 
D21        E15        5.6       8.8       4.4            S 
D22        D19        6.7       12.7      4.4            S 
D23        E16        5.5       9.7       4.4            S 
D24        F15        5.3       8.3       4.4            S 
D25        E17        6.6       9.9       4.4            S 
D26        F16        5.3       9.7       4.4            S 
D27        F19        6.2       11.7      4.4            S 
D28        G15        5.1       7.9       4.4            S 
D29        G19        6.2       11.8      4.4            S 
D30        G16        5.1       8.9       4.4            S 
D31        M19        8.6       16.2      4.4            S 
D32        H15        5.2       7.7       4.4            S 
D33        R18        11.0      19.6      4.4            S 
D34        P19        8.0       18.4      4.4            S 
D35        R19        9.1       18.8      4.4            S 
D36        N19        8.1       16.9      4.4            S 
D37        S19        9.2       20.7      4.4            S 
D38        J16        8.4       8.9       4.4            S 
D39        T19        10.5      19.6      4.4            S 
D40        U19        10.8      19.1      4.4            S 
D41        L16        8.3       10.9      4.4            S 
D42        T18        10.5      17.8      4.4            S 
D43        K16        8.4       8.8       4.4            S 
D44        U18        10.1      17.7      4.4            S 
D45        K15        9.3       7.5       4.4            S 
D46        S17        9.5       14.5      4.4            S 
D47        M16        8.0       9.8       4.4            S 
D48        L15        8.0       7.7       4.4            S 
D49        T17        8.7       14.6      4.4            S 
D50        N16        7.8       9.9       4.4            S 
D51        U17        8.6       15.2      4.4            S 
D52        S18        7.6       14.3      4.4            S 
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Table 9.5 Buffer Models (Continued) 



Pin Name   Location   Cp (pF)   Lp (nH)   Input Buffer   Output Buffer Size 
                      Typical   Typical   Cin (pF)       (Large or Small) 
                                          Typical 

D53        M15        7.7       7.1       4.4            S 
D54        Q16        7.0       11.1      4.4            S 
D55        U16        8.0       14.3      4.4            S 
D56        T16        7.8       12.8      4.4            S 
D57        R16        6.5       11.8      4.4            S 
D58        S16        7.5       11.3      4.4            S 
D59        P15        6.2       8.7       4.4            S 
D60        R15        7.1       9.6       4.4            S 
D61        Q15        5.9       9.3       4.4            S 
D62        S15        6.9       10.7      4.4            S 
D63        R14        5.6       9.7       4.4            S 
D/C#       R05        5.8       9.7                      S 
DP0        A15        7.7       18.3      4.4            S 
DP1        A19        9.7       18.9      4.4            S 
DP2        D13        7.1       8.5       4.4            S 
DP3        D18        6.7       11.3      4.4            S 
DP4        Q19        10.4      19.0      4.4            S 
DP5        J15        9.9       7.7       4.4            S 
DP6        P16        9.3       10.7      4.4            S 
DP7        N15        6.8       8.9       4.4            S 
EADS#      S05        5.5       10.5      2.0 
EWBE#      D08        7.5       7.6       2.0 
FLINE#     S08        5.4       8.1       2.0 
HIT#       S12        5.9       11.1                     S 
HITM#      N05        6.2       8.2                      L 
HLDA       S09        5.3       7.9                      S 
HOLD       R12        6.1       11.1      2.0 
INT/CS8    S06        5.2       10.0      2.0 
INV        R07        5.3       8.2       2.0 
KB0        R11        6.1       9.2                      S 
KB1        S10        6.4       7.9                      S 
KEN#       U02        7.4       13.4      2.0 
LEN        T02        7.9       12.8                     S 
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Table 9.5 Buffer Models (Continued) 



Pin Name   Location   Cp (pF)   Lp (nH)   Input Buffer   Output Buffer Size 
                      Typical   Typical   Cin (pF)       (Large or Small) 
                                          Typical 

LOCK#      S03        7.7       11.2                     S 
M/IO#      S04        7.3       10.3                     S 
NA#        U03        7.1       13.0      2.0 
NENE#      S11        6.3       9.6                      S 
PCD        R06        5.6       8.9                      S 
PCHK#      H16        5.1       8.8                      S 
PCYC       T04        7.2       11.4                     S 
PEN#       R08        4.8       7.8       2.0 
PWT        T03        7.4       12.1                     S 
RESET      S02        7.9       12.5      2.0 
SPARE      L04        NC 
TCK        Q01        5.8       14.1      2.0 
TDI        S14        6.5       9.8       2.0 
TDO        R10        6.3       7.6                      S 
TMS        R13        5.6       9.6       2.0 
TRST#      S13        6.3       9.6       2.0 
W/R#       T01        7.8       14.3                     L/S 
WB/WT#     U04        6.7       12.3      2.0 
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10.0 INSTRUCTION SET 

Key to abbreviations: 

For register operands, the abbreviations that describe 
the operands are composed of two parts. The 
first part describes the type of register: 

c    One of the control registers fir, psr, epsr, 
     dirbase, db, fsr, bear, ccr, p0, p1, p2, or p3 

f    One of the floating-point registers: f0 through 
     f31 

i    One of the integer registers: r0 through r31 

The second part identifies the field of the machine 
instruction into which the operand is to be placed: 

src1    The first of the two source-register designators, 
        which may be either a register or a 16-bit 
        immediate constant or address offset. The 
        immediate value is zero-extended for logical 
        operations and is sign-extended for add and 
        subtract operations (including addu and subu) 
        and for all addressing calculations. 

src1ni  Same as src1 except that no immediate 
        constant or address offset value is permitted. 

src1s   Same as src1 except that the immediate 
        constant is a 5-bit value that is zero-extended 
        to 32 bits. 

src2    The second of the two source-register 
        designators. 

dest    The destination register designator. 

Thus, the operand specifier isrc2, for example, 
means that an integer register is used and that the 
encoding of that register must be placed in the src2 
field of the machine instruction. 
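
The extension rule for a 16-bit immediate in the first source field can be illustrated with a short sketch. It is not taken from the data sheet: the function name `extend_immediate` and the set `LOGICAL_OPS` are hypothetical, and the sketch assumes that every operation outside the logical group follows the sign-extension rule.

```python
MASK32 = 0xFFFFFFFF
# Assumption: the logical group is and, andnot, or, xor (register/immediate forms).
LOGICAL_OPS = {"and", "andnot", "or", "xor"}

def extend_immediate(imm16, operation):
    """Widen a 16-bit first-source immediate to 32 bits for the given operation."""
    imm16 &= 0xFFFF
    if operation in LOGICAL_OPS:
        return imm16                          # zero-extended for logical operations
    if imm16 & 0x8000:
        return (imm16 - 0x10000) & MASK32     # sign-extended, negative value
    return imm16                              # sign-extended, non-negative value
```

For example, the immediate 0x8000 stays 0x00008000 under `or` but becomes 0xFFFF8000 under `addu`, which is why add and subtract immediates can express small negative constants.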

Other (nonregister) operands are specified by a one- 
part abbreviation that represents both the type of 
operand required and the instruction field into which 
the value of the operand is placed: 

#const  A 16-bit immediate constant or address offset 
        that the i860 XP microprocessor sign-extends 
        to 32 bits when computing the effective 
        address. 

lbroff  A signed, 26-bit, immediate, relative branch 
        offset. 

sbroff  A signed, 16-bit, immediate, relative branch 
        offset. 

brx     A function that computes the target address 
        by shifting the offset (either lbroff or 
        sbroff) left by two bits, sign-extending it to 
        32 bits, and adding the result to the current 
        instruction pointer plus four. The resulting 
        target address may lie anywhere within the 
        address space. 
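
The brx computation described above can be sketched as follows. This is an illustration only: the helper names `sign_extend` and `brx` and the explicit `width` parameter (26 for the long offset, 16 for the short one) are assumptions, and the result is wrapped to 32 bits on the assumption that addresses are 32 bits wide.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign

def brx(offset_field, width, ip):
    """Target address for a relative branch located at address ip.

    offset_field is the raw 26-bit or 16-bit offset from the instruction;
    width is 26 or 16. Sign-extending first and then shifting left by two
    gives the same value as shifting first and then sign-extending.
    """
    displacement = sign_extend(offset_field, width) << 2
    return (ip + 4 + displacement) & 0xFFFFFFFF
```

An offset of zero therefore targets the instruction that follows the branch, and an offset of -1 targets the branch itself.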

Table 10.1. Precision Specification 

Suffix    Source Precision    Result Precision 
.ss       single              single 
.sd       single              double 
.dd       double              double 
.ds       double              single 

Unless otherwise specified, floating-point operations 
accept single- or double-precision source operands and 
produce a result of equal or greater precision. Both input 
operands must have the same precision. The source and result 
precision are specified by a two-letter suffix to the 
mnemonic of the operation. 

Other abbreviations include: 

.p                     Precision specification .ss, .sd, or .dd 
                       (.ds not permitted). Refer to Table 10.1. 

.r                     Precision specification .ss, .sd, .ds, or .dd. 
                       Refer to Table 10.1. 

.v                     .sd or .dd. Refer to Table 10.1. 

.w                     .ss or .dd. Refer to Table 10.1. 

.x                     .b (8 bits), .s (16 bits), or .l (32 bits) 

.y                     .l (32 bits), .d (64 bits), or .q (128 bits) 

mem.x(address)         The memory location indicated by address 
                       with a size of x. 

port.x(address)        The I/O port indicated by address with a 
                       size of x. 

int_vector.x(address)  The interrupt vector with a size of x 
                       returned from I/O port address. 

PM                     The pixel mask, which is considered as an 
                       array of eight bits PM(7)..PM(0), where 
                       PM(0) is the least-significant bit. 

10.1 Instruction Definitions in Alphabetical Order 

adds isrc1, isrc2, idest    Add Signed 

idest <- isrc1 + isrc2 
OF <- (bit 31 carry ≠ bit 30 carry) 
CC set if isrc2 + isrc1 < 0 (signed) 
CC clear if isrc2 + isrc1 ≥ 0 (signed) 

addu isrc1, isrc2, idest    Add Unsigned 

idest <- isrc1 + isrc2 
OF <- bit 31 carry 
CC <- bit 31 carry 
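
The flag rules for the two add instructions can be modeled with a short sketch. It is illustrative only: the function names and the returned tuple layout are hypothetical, "bit 30 carry" is read as the carry out of bit 30 into bit 31, and the signed comparison for CC is assumed to use the exact (unwrapped) sum.

```python
MASK32 = 0xFFFFFFFF

def to_signed(value):
    """Interpret a 32-bit pattern as a two's complement integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value

def adds(src1, src2):
    """Return (dest, OF, CC) for a signed add of two 32-bit patterns."""
    a, b = src1 & MASK32, src2 & MASK32
    carry30 = ((a & 0x7FFFFFFF) + (b & 0x7FFFFFFF)) >> 31  # carry out of bit 30
    carry31 = (a + b) >> 32                                 # carry out of bit 31
    dest = (a + b) & MASK32
    overflow = carry30 != carry31
    cc = to_signed(a) + to_signed(b) < 0   # assumption: exact signed sum
    return dest, overflow, cc

def addu(src1, src2):
    """Return (dest, OF, CC) for an unsigned add; both flags take the bit 31 carry."""
    total = (src1 & MASK32) + (src2 & MASK32)
    carry31 = bool(total >> 32)
    return total & MASK32, carry31, carry31
```

With these definitions, adding 1 to 0x7FFFFFFF sets OF under `adds` (the carries out of bits 30 and 31 differ), while adding 1 to 0xFFFFFFFF sets both flags under `addu`.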

and isrc1, isrc2, idest    Logical AND 

idest <- isrc1 and isrc2 
CC set if result is zero, cleared otherwise 

andh #const, isrc2, idest    Logical AND High 

idest <- (#const shifted left 16 bits) and isrc2 
CC set if result is zero, cleared otherwise 

andnot isrc1, isrc2, idest    Logical AND NOT 

idest <- (not isrc1) and isrc2 
CC set if result is zero, cleared otherwise 

andnoth #const, isrc2, idest    Logical AND NOT High 

idest <- (not (#const shifted left 16 bits)) and isrc2 
CC set if result is zero, cleared otherwise 

bc lbroff    Branch on CC 

IF CC = 1 
THEN continue execution at brx(lbroff) 
FI 

bc.t lbroff    Branch on CC, Taken 

IF CC = 1 
THEN execute one more sequential instruction 
     continue execution at brx(lbroff) 
ELSE skip next sequential instruction 
FI 

bla isrc1ni, isrc2, sbroff    Branch on LCC and Add 

LCC-temp clear if isrc2 + isrc1ni < 0 (signed) 
LCC-temp set if isrc2 + isrc1ni ≥ 0 (signed) 
isrc2 <- isrc1ni + isrc2 
Execute one more sequential instruction 
IF LCC 
THEN LCC <- LCC-temp 
     continue execution at brx(sbroff) 
ELSE LCC <- LCC-temp 
FI 

bnc lbroff    Branch on Not CC 

IF CC = 0 
THEN continue execution at brx(lbroff) 
FI 

bnc.t lbroff    Branch on Not CC, Taken 

IF CC = 0 
THEN execute one more sequential instruction 
     continue execution at brx(lbroff) 
ELSE skip next sequential instruction 
FI 

br lbroff    Branch Direct Unconditionally 

Execute one more sequential instruction. 
Continue execution at brx(lbroff). 

bri [isrc1ni]    Branch Indirect Unconditionally 

Execute one more sequential instruction 
IF any trap bit in psr is set 
THEN copy PU to U, PIM to IM in psr 
     clear trap bits 
     IF DS is set and DIM is reset 
     THEN enter dual-instruction mode after executing one 
          instruction in single-instruction mode 
     ELSE IF DS is set and DIM is set 
          THEN enter single-instruction mode after executing one 
               instruction in dual-instruction mode 
          ELSE IF DIM is set 
               THEN enter dual-instruction mode 
                    for next instruction pair 
               ELSE enter single-instruction mode 
                    for next instruction pair 
               FI 
          FI 
     FI 
FI 
Continue execution at address in isrc1ni 
(The original contents of isrc1ni are used even if the next instruction 
modifies isrc1ni. Does not trap if isrc1ni is misaligned.) 

bte isrc1s, isrc2, sbroff    Branch If Equal 

IF isrc1s = isrc2 
THEN continue execution at brx(sbroff) 
FI 

btne isrc1s, isrc2, sbroff    Branch If Not Equal 

IF isrc1s ≠ isrc2 
THEN continue execution at brx(sbroff) 
FI 

call lbroff    Subroutine Call 

r1 <- address of next sequential instruction + 4 (or + 8 in dual mode) 
Execute one more sequential instruction 
Continue execution at brx(lbroff) 

calli [isrc1ni]    Indirect Subroutine Call 

r1 <- address of next sequential instruction + 4 (or + 8 in dual mode) 
Execute one more sequential instruction 
Continue execution at address in isrc1ni 
(The original contents of isrc1ni are used even if the next instruction 
modifies isrc1ni. Does not trap if isrc1ni is misaligned. The 
register isrc1ni must not be r1.) 

fadd.p fsrc1, fsrc2, fdest    Floating-Point Add 

fdest <- fsrc1 + fsrc2 

faddp fsrc1, fsrc2, fdest    Add with Pixel Merge 

fdest <- fsrc1 + fsrc2 (using integer arithmetic; 8-byte operands and destination) 
Shift and load MERGE register from fsrc1 + fsrc2 as defined in Table 10.2 

faddz fsrc1, fsrc2, fdest    Add with Z Merge 

fdest <- fsrc1 + fsrc2 (using integer arithmetic; 8-byte operands and destination) 
Shift MERGE right 16 and load fields 31..16 and 63..48 from fsrc1 + fsrc2 

famov.r fsrc1, fdest    Floating-Point Adder Move 

fdest <- fsrc1 

fiadd.w fsrc1, fsrc2, fdest    Long-Integer Add 

fdest <- fsrc1 + fsrc2 (2's complement integer arithmetic) 

fisub.w fsrc1, fsrc2, fdest    Long-Integer Subtract 

fdest <- fsrc1 - fsrc2 (2's complement integer arithmetic) 

fix.v fsrc1, fdest    Floating-Point to Integer Conversion 

fdest <- 64-bit value with low-order 32 bits equal to integer part of fsrc1 rounded 

Floating-Point Load 

fld.y isrc1(isrc2), fdest      (Normal) 
fld.y isrc1(isrc2)++, fdest    (Autoincrement) 

fdest <- mem.y (isrc1 + isrc2) 
IF autoincrement 
THEN isrc2 <- isrc1 + isrc2 
FI 

Cache Flush 

flush #const(isrc2)      (Normal) 
flush #const(isrc2)++    (Autoincrement) 

Write back (if modified) the line in data cache that has address (#const + isrc2) 
80860XR: and set tag value to (#const + isrc2). 
80860XP: and invalidate its virtual and physical tags. 
Contents of line undefined. 
IF autoincrement 
THEN isrc2 <- #const + isrc2 
FI 

fmlow.dd fsrc1, fsrc2, fdest    Floating-Point Multiply Low 

fdest <- low-order 53 bits of (fsrc1 mantissa x fsrc2 mantissa) 
fdest bit 53 <- most significant bit of (fsrc1 mantissa x fsrc2 mantissa) 

fmov.r fsrc1, fdest    Floating-Point Reg-Reg Move 

Assembler pseudo-operation 
fmov.ss fsrc1, fdest = fiadd.ss fsrc1, f0, fdest 
fmov.dd fsrc1, fdest = fiadd.dd fsrc1, f0, fdest 
fmov.sd fsrc1, fdest = famov.sd fsrc1, fdest 
fmov.ds fsrc1, fdest = famov.ds fsrc1, fdest 

fmul.p fsrc1, fsrc2, fdest    Floating-Point Multiply 

fdest <- fsrc1 x fsrc2 

fnop    Floating-Point No Operation 

Assembler pseudo-operation 
fnop = shrd r0, r0, r0 

form fsrc1, fdest    OR with MERGE Register 

fdest <- fsrc1 OR MERGE 
MERGE <- 0 

frcp.p fsrc2, fdest    Floating-Point Reciprocal 

fdest <- 1 / fsrc2 with maximum mantissa error < 2^-7 

frsqr.p fsrc2, fdest    Floating-Point Reciprocal Square Root 

fdest <- 1 / sqrt(fsrc2) with maximum mantissa error < 2^-7 

Floating-Point Store 

fst.y fdest, isrc1(isrc2)      (Normal) 
fst.y fdest, isrc1(isrc2)++    (Autoincrement) 

mem.y (isrc2 + isrc1) <- fdest 
IF autoincrement 
THEN isrc2 <- isrc1 + isrc2 
FI 

fsub.p fsrc1, fsrc2, fdest    Floating-Point Subtract 

fdest <- fsrc1 - fsrc2 

ftrunc.v fsrc1, fdest    Floating-Point to Integer Conversion 

fdest <- 64-bit value with low-order 32 bits equal to integer part of fsrc1 

fxfr fsrc1, idest    Transfer F-P to Integer Register 

idest <- fsrc1 

fzchkl fsrc1, fsrc2, fdest    32-Bit Z-Buffer Check 

Consider the 64-bit operands as arrays of two 32-bit 
fields fsrc1(1)..fsrc1(0), fsrc2(1)..fsrc2(0), and fdest(1)..fdest(0) 
where zero denotes the least-significant field. 
PM <- PM shifted right by 2 bits 
FOR i = 0 to 1 
DO 
    PM[i + 6] <- fsrc2(i) ≤ fsrc1(i) (unsigned) 
    fdest(i) <- smaller of fsrc2(i) and fsrc1(i) 
OD 
MERGE <- 0 

fzchks fsrc1, fsrc2, fdest    16-Bit Z-Buffer Check 

Consider the 64-bit operands as arrays of four 16-bit 
fields fsrc1(3)..fsrc1(0), fsrc2(3)..fsrc2(0), and fdest(3)..fdest(0) 
where zero denotes the least-significant field. 
PM <- PM shifted right by 4 bits 
FOR i = 0 to 3 
DO 
    PM[i + 4] <- fsrc2(i) ≤ fsrc1(i) (unsigned) 
    fdest(i) <- smaller of fsrc2(i) and fsrc1(i) 
OD 
MERGE <- 0 
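
The 16-bit Z-buffer check can be modeled on plain integers as a rough sketch. The function name `fzchks` and its argument order (first source, second source, pixel mask) are hypothetical, the comparison is assumed to be "less than or equal", and clearing of the MERGE register is left out.

```python
def fzchks(fsrc1, fsrc2, pm):
    """Return (fdest, new_pm) for a 16-bit Z-buffer check on 64-bit operands.

    fsrc1 and fsrc2 are 64-bit integers holding four unsigned 16-bit fields,
    field 0 being the least significant. pm is the 8-bit pixel mask.
    """
    pm = (pm & 0xFF) >> 4                     # make room for four new mask bits
    fdest = 0
    for i in range(4):
        z2 = (fsrc2 >> (16 * i)) & 0xFFFF
        z1 = (fsrc1 >> (16 * i)) & 0xFFFF
        if z2 <= z1:                          # unsigned compare sets PM[i + 4]
            pm |= 1 << (i + 4)
        fdest |= min(z2, z1) << (16 * i)      # keep the smaller Z value
    return fdest, pm
```

Two consecutive calls fill all eight mask bits, since each call shifts the previous four results into PM(3)..PM(0) before writing PM(7)..PM(4).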

intovr    Software Trap on Integer Overflow 

IF OF = 1 
THEN generate trap with IT set in psr 
FI 

ixfr isrc1ni, fdest    Transfer Integer to F-P Register 

fdest <- isrc1ni 

ld.c csrc2, idest    Load from Control Register 

idest <- csrc2 

ld.x isrc1(isrc2), idest    Load Integer 

idest <- mem.x (isrc1 + isrc2) 

ldint.x isrc2, idest    Load Interrupt Vector 

idest <- int_vector.x (isrc2) 
NOTE: Not available with the i860 XR CPU 

ldio.x isrc2, idest    Load I/O 

idest <- port.x (isrc2) 
NOTE: Not available with the i860 XR CPU 

lock    Begin Interlocked Sequence 

Set BL in dirbase. 
The next load or store that appears on the bus locks that location. 
Disable interrupts until the bus is unlocked. 

mov isrc2, idest    Register-Register Move 

Assembler pseudo-operation 
mov isrc2, idest = shl r0, isrc2, idest 

mov const32, idest    Constant-to-Register Move 

Assembler pseudo-operation 
when 0xFFFF8000 ≤ const32 < 0x8000 ... 
    adds l%const32, r0, idest 
otherwise ... 
    orh h%const32, r0, idest 
    or l%const32, idest, idest 
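
The constant-to-register expansion can be checked with a small model. This is a sketch under stated assumptions: `l%` and `h%` are taken to mean the low and high 16 bits of the constant, r0 is assumed to read as zero, the range test is read as a signed comparison, and the names `expand_mov` and `run` are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def sext16(value):
    """Sign-extend a 16-bit pattern to a Python integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def expand_mov(const32):
    """Return the (mnemonic, immediate) sequence chosen for mov const32, idest."""
    value = const32 & MASK32
    signed = value - (1 << 32) if value & 0x80000000 else value
    low, high = value & 0xFFFF, value >> 16
    if -0x8000 <= signed < 0x8000:
        return [("adds", low)]            # one instruction; immediate is sign-extended
    return [("orh", high), ("or", low)]   # two instructions; immediates are zero-extended

def run(ops):
    """Execute the sequence on a model idest register, with r0 reading as zero."""
    reg = 0
    for op, imm in ops:
        if op == "adds":
            reg = sext16(imm) & MASK32    # idest <- imm + r0
        elif op == "orh":
            reg = (imm << 16) & MASK32    # idest <- (imm << 16) OR r0
        elif op == "or":
            reg = (imm | reg) & MASK32    # idest <- imm OR idest
    return reg
```

The single-instruction form works only because `adds` sign-extends its immediate; a constant such as 0x8000 falls outside that range and needs the two-instruction form.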

nop    Core-Unit No Operation 

Assembler pseudo-operation 
nop = shl r0, r0, r0 

or isrc1, isrc2, idest    Logical OR 

idest <- isrc1 OR isrc2 
CC set if result is zero, cleared otherwise 

orh #const, isrc2, idest    Logical OR High 

idest <- (#const shifted left 16 bits) OR isrc2 
CC set if result is zero, cleared otherwise 

pfadd.p fsrc1, fsrc2, fdest    Pipelined Floating-Point Add 

fdest <- last stage adder result 
Advance A pipeline one stage 
A pipeline first stage <- fsrc1 + fsrc2 
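
The three steps of the pipelined add (deliver the last-stage result, advance, load the first stage) can be sketched with a small queue model. The class name `AdderPipeline` is hypothetical, and the default depth of three stages is an assumption for illustration; this section does not state the adder pipeline depth.

```python
from collections import deque

class AdderPipeline:
    """Toy model of a pipelined adder; stage count is an assumed parameter."""

    def __init__(self, stages=3):
        # Index 0 is the first stage, index -1 the last stage.
        self.stages = deque([None] * stages, maxlen=stages)

    def pfadd(self, src1, src2):
        result = self.stages[-1]            # fdest <- last stage adder result
        # appendleft on a full bounded deque drops the old last stage,
        # which advances the pipeline and loads the new first stage.
        self.stages.appendleft(src1 + src2)
        return result
```

Under this model the sum issued by one pfadd reaches a destination register only when the third following pipelined add executes, which is why software schedules extra pipelined operations to drain results.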

pfaddp fsrc1, fsrc2, fdest    Pipelined Add with Pixel Merge 

fdest <- last-stage graphics-unit result 
last-stage graphics-unit result <- fsrc1 + fsrc2 
    (using integer arithmetic; 8-byte operands and destination) 
Shift, then load MERGE register from fsrc1 + fsrc2 as defined in Table 10.2 

pfaddz fsrc1, fsrc2, fdest    Pipelined Add with Z Merge 

fdest <- last-stage graphics-unit result 
last-stage graphics-unit result <- fsrc1 + fsrc2 
    (using integer arithmetic; 8-byte operands and destination) 
Shift MERGE right 16, then load fields 31..16 and 63..48 from fsrc1 + fsrc2 

pfam.p fsrc1, fsrc2, fdest    Pipelined Floating-Point Add and Multiply 

fdest <- last stage adder result 
Advance A and M pipeline one stage (operands accessed before advancing pipeline) 
A pipeline first stage <- A-op1 + A-op2 
M pipeline first stage <- M-op1 x M-op2 

pfamov.r fsrc1, fdest    Pipelined Floating-Point Adder Move 

fdest <- last stage adder result 
Advance A pipeline one stage 
A pipeline first stage <- fsrc1 

pfeq.p fsrc1, fsrc2, fdest    Pipelined Floating-Point Equal Compare 

fdest <- last stage adder result 
CC set if fsrc1 = fsrc2, else cleared 
Advance A pipeline one stage 
A pipeline first stage is undefined, but no result exception occurs 

pfgt.p fsrc1, fsrc2, fdest    Pipelined Floating-Point Greater-Than Compare 

(Assembler clears R-bit of instruction) 
fdest <- last stage adder result 
CC set if fsrc1 > fsrc2, else cleared 
Advance A pipeline one stage 
A pipeline first stage is undefined, but no result exception occurs 

pfiadd.w fsrc1, fsrc2, fdest    Pipelined Long-Integer Add 

fdest <- last-stage graphics-unit result 
last-stage graphics-unit result <- fsrc1 + fsrc2 (2's complement integer arithmetic) 

pfisub.w fsrc1, fsrc2, fdest    Pipelined Long-Integer Subtract 

fdest <- last-stage graphics-unit result 
last-stage graphics-unit result <- fsrc1 - fsrc2 (2's complement integer arithmetic) 

pfix.v fsrc1, fdest    Pipelined Floating-Point to Integer Conversion 

fdest <- last stage adder result 
Advance A pipeline one stage 
A pipeline first stage <- 64-bit value with low-order 32 bits 
    equal to integer part of fsrc1 rounded 

Pipelined Floating-Point Load 

pfld.y isrc1(isrc2), fdest      (Normal) 
pfld.y isrc1(isrc2)++, fdest    (Autoincrement) 

fdest <- mem.y (third previous pfld's (isrc1 + isrc2)) 
    (where .y is precision of third previous pfld.y) 
IF autoincrement 
THEN isrc2 <- isrc1 + isrc2 
FI 
NOTE: pfld.q is not available with the i860 XR CPU 

pfle.p fsrc1, fsrc2, fdest    Pipelined F-P Less-Than or Equal Compare 

(Assembler sets R-bit of instruction) 
fdest <- last stage adder result 
CC clear if fsrc1 ≤ fsrc2, else set 
Advance A pipeline one stage 
A pipeline first stage is undefined, but no result exception occurs 

pfmam.p fsrc1, fsrc2, fdest    Pipelined Floating-Point Add and Multiply 

fdest <- last stage multiplier result 
Advance A and M pipeline one stage (operands accessed before advancing pipeline) 
A pipeline first stage <- A-op1 + A-op2 
M pipeline first stage <- M-op1 x M-op2 

pfmov.r fsrc1, fdest    Pipelined Floating-Point Reg-Reg Move 

Assembler pseudo-operation 
pfmov.ss fsrc1, fdest = pfiadd.ss fsrc1, f0, fdest 
pfmov.dd fsrc1, fdest = pfiadd.dd fsrc1, f0, fdest 
pfmov.sd fsrc1, fdest = pfamov.sd fsrc1, fdest 
pfmov.ds fsrc1, fdest = pfamov.ds fsrc1, fdest 

pfmsm.p fsrc1, fsrc2, fdest    Pipelined Floating-Point Subtract and Multiply 

fdest <- last stage multiplier result 
Advance A and M pipeline one stage (operands accessed before advancing pipeline) 
A pipeline first stage <- A-op1 - A-op2 
M pipeline first stage <- M-op1 x M-op2 

pfmul.p fsrc1, fsrc2, fdest    Pipelined Floating-Point Multiply 

fdest <- last stage multiplier result 
Advance M pipeline one stage 
M pipeline first stage <- fsrc1 x fsrc2 

pfmul3.dd fsrc1, fsrc2, fdest    Three-Stage Pipelined Multiply 

fdest <- last stage multiplier result 
Advance 3-Stage M pipeline one stage 
M pipeline first stage <- fsrc1 x fsrc2 

pform fsrc1, fdest    Pipelined OR to MERGE Register 

fdest <- last-stage graphics-unit result 
last-stage graphics-unit result <- fsrc1 OR MERGE 
MERGE <- 0 

pfsm.p fsrc1, fsrc2, fdest    Pipelined Floating-Point Subtract and Multiply 

fdest <- last stage adder result 
Advance A and M pipeline one stage (operands accessed before advancing pipeline) 
A pipeline first stage <- A-op1 - A-op2 
M pipeline first stage <- M-op1 x M-op2 

pfsub.p fsrc1, fsrc2, fdest    Pipelined Floating-Point Subtract 

fdest <- last stage adder result 
Advance A pipeline one stage 
A pipeline first stage <- fsrc1 - fsrc2 

pftrunc.v fsrc1, fdest    Pipelined Floating-Point to Integer Conversion 

fdest <- last stage adder result 
Advance A pipeline one stage 
A pipeline first stage <- 64-bit value with low-order 32 bits 
    equal to integer part of fsrc1 

pfzchkl fsrc1, fsrc2, fdest    Pipelined 32-Bit Z-Buffer Check 

Consider the 64-bit operands as arrays of two 32-bit 
fields fsrc1(1)..fsrc1(0), fsrc2(1)..fsrc2(0), and fdest(1)..fdest(0) 
where zero denotes the least-significant field. 
PM <- PM shifted right by 2 bits 
FOR i = 0 to 1 
DO 
    PM[i + 6] <- fsrc2(i) ≤ fsrc1(i) (unsigned) 
    fdest(i) <- last-stage graphics-unit result 
    last-stage graphics-unit result <- smaller of fsrc2(i) and fsrc1(i) 
OD 
MERGE <- 0 

pfzchks fsrc1, fsrc2, fdest    Pipelined 16-Bit Z-Buffer Check 

Consider the 64-bit operands as arrays of four 16-bit 
fields fsrc1(3)..fsrc1(0), fsrc2(3)..fsrc2(0), and fdest(3)..fdest(0) 
where zero denotes the least-significant field. 
PM <- PM shifted right by 4 bits 
FOR i = 0 to 3 
DO 
    PM[i + 4] <- fsrc2(i) ≤ fsrc1(i) (unsigned) 
    fdest(i) <- last-stage graphics-unit result 
    last-stage graphics-unit result(i) <- smaller of fsrc2(i) and fsrc1(i) 
OD 
MERGE <- 0 

pst.d fdest, #const(isrc2)      Pixel Store 
pst.d fdest, #const(isrc2)++    Pixel Store Autoincrement 

Pixels enabled by PM in mem.d (isrc2 + #const) <- fdest 
Shift PM right by 8/pixel size (in bytes) bits 
IF autoincrement 
THEN isrc2 <- #const + isrc2 
FI 

scyc.x isrc2    Special Cycles 

Generate a special bus cycle (D/C# = 0, W/R# = 1, M/IO# = 0) and 
set BE7#-BE0# according to the value contained in the register isrc2 
NOTE: Not available with the i860 XR CPU 

shl isrc1, isrc2, idest    Shift Left 

idest <- isrc2 shifted left by isrc1 bits 

shr isrc1, isrc2, idest    Shift Right 

SC (in psr) <- isrc1 
idest <- isrc2 shifted right by isrc1 bits 

shra isrc1, isrc2, idest    Shift Right Arithmetic 

idest <- isrc2 arithmetically shifted right by isrc1 bits 

shrd isrc1ni, isrc2, idest    Shift Right Double 

idest <- low-order 32 bits of isrc1ni:isrc2 shifted right by SC bits 
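
The interaction between the shift count saved by a right shift and the later double shift can be sketched as follows. The class `Core` and its method names are hypothetical, and the sketch assumes that the SC field keeps only the low five bits of the shift amount.

```python
MASK32 = 0xFFFFFFFF

class Core:
    """Toy model holding only the SC (shift count) field of psr."""

    def __init__(self):
        self.sc = 0

    def shr(self, src1, src2):
        self.sc = src1 & 0x1F                 # assumption: SC is a 5-bit field
        return (src2 & MASK32) >> self.sc     # logical shift right

    def shrd(self, src1ni, src2):
        # Concatenate src1ni (high half) with src2 (low half), then shift by SC.
        combined = ((src1ni & MASK32) << 32) | (src2 & MASK32)
        return (combined >> self.sc) & MASK32
```

Passing the same register as both sources of the double shift yields a rotate right by SC bits, which is one way to build a rotate from these two instructions.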

st.c isrc1ni, csrc2    Store to Control Register 

csrc2 <- isrc1ni 

st.x isrc1ni, #const(isrc2)    Store Integer 

mem.x (isrc2 + #const) <- isrc1ni 

stio.x isrc1ni, isrc2    Store I/O 

port.x (isrc2) <- isrc1ni 
NOTE: Not available with the i860 XR CPU 

subs isrc1, isrc2, idest    Subtract Signed 

idest <- isrc1 - isrc2 
OF <- (bit 31 carry ≠ bit 30 carry) 
CC set if isrc2 > isrc1 (signed) 
CC clear if isrc2 ≤ isrc1 (signed) 

subu isrc1, isrc2, idest    Subtract Unsigned 

idest <- isrc1 - isrc2 
OF <- NOT (bit 31 carry) 
CC <- bit 31 carry 
(i.e., CC set if isrc2 ≤ isrc1 (unsigned); 
CC clear if isrc2 > isrc1 (unsigned)) 

trap isrc1ni, isrc2, idest    Software Trap 

Generate trap with IT set in psr 

unlock    End Interlocked Sequence 

Clear BL in dirbase. The next load or store 
unlocks the bus. Interrupts are enabled. 

xor isrc1, isrc2, idest    Logical Exclusive OR 

idest <- isrc1 XOR isrc2 
CC set if result is zero, cleared otherwise 

xorh #const, isrc2, idest    Logical Exclusive OR High 

idest <- (#const shifted left 16 bits) XOR isrc2 
CC set if result is zero, cleared otherwise 








Table 10.2. FADDP MERGE Update 

Pixel Size    Fields Loaded from                  Right Shift Amount 
(from PS)     Result into MERGE                   (Field Size) 
8             63..56, 47..40, 31..24, 15..8       8 
16            63..58, 47..42, 31..26, 15..10      6 
32            63..56, 31..24                      8 

10.2 Instruction Format and Encoding 

All instructions are 32 bits long and begin on a four-byte 
boundary. When operands are registers, the 
encodings shown in Table 10.3 are used. 

There are two general core-instruction formats 
(REG-format and CTRL-format) and a separate format 
for floating-point instructions. 

Table 10.3. Register Encoding 

Register                      Encoding 
r0                            0 
r31                           31 
f0                            0 
f31                           31 
Fault Instruction             0 
Processor Status              1 
Directory Base                2 
Data Breakpoint               3 
Floating-Point Status         4 
Extended Processor Status     5 
Bus Error Address*            6 
Concurrency Control*          7 
p0*                           8 
p1*                           9 
p2*                           10 
p3*                           11 

NOTE: 
*Available only with i860 XP CPU. Using these encodings 
with the i860 XR CPU produces undefined results. 

10.2.1 REG-FORMAT INSTRUCTIONS 

Within the REG-format are several variations as 
shown in Figure 10.1. Table 10.4 gives the encodings 
for these instructions. One encoding is an escape 
code that defines yet another variation: the 
core escape instructions. Figure 10.2 shows the format 
of this group, and Table 10.5 shows the encodings. 

In these instructions, the src2 field selects one of 
the 32 integer registers (most instructions) or one of 
the control registers (st.c and ld.c). Dest selects 
one of the 32 integer registers (most instructions) or 
floating-point registers (fld, fst, pfld, pst, ixfr). For 
instructions where src1 is optionally an immediate 
value, bit 26 of the opcode (I-bit) indicates whether 
src1 is an immediate. If bit 26 is clear, an integer 
register is used; if bit 26 is set, src1 is contained in 
the low-order 16 bits, except for bte and btne 
instructions. For bte and btne, the five-bit immediate 
value is contained in the src1 field. For st, bte, btne, 
and bla, the upper five bits of the offset or broffset 
are contained in the dest field instead of src1, and 
the lower 11 bits of offset are the lower 11 bits of 
the instruction. 

For ld and st, bits 28 and zero determine operand 
size as follows: 

Bit 28    Bit 0    Operand Size 
0         0        8-bits 
0         1        8-bits 
1         0        16-bits 
1         1        32-bits 

When src1 is immediate and bit 28 is set, bit zero of 
the immediate value is forced to zero. 

For fld, fst, pfld, pst, and flush, bit zero selects 
autoincrement addressing if set. For fld, fst, pfld, and pst, 
bits one and two select the operand size as follows: 

Bit 1    Bit 2    Operand Size 
0        0        64-bits 
0        1        128-bits 
1        0        32-bits 
1        1        32-bits 

When src1 is immediate, bits zero and one of the 
immediate value are forced to zero to maintain alignment. 
When bit one of the immediate value is clear, 
bit two is also forced to zero. 

For flush, bits one and two must be zero. 

For the instructions ldio, stio, ldint, and scyc, the 
operand size is encoded by bits 9 and 10 as follows. 
For other instructions, these bits are reserved and 
should be set to zero. 

Operand Size    Bit 10    Bit 9 
8 Bits (.b)     0         0 
16 Bits (.s)    0         1 
32 Bits (.l)    1         0 
reserved        1         1 
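
The REG-format field layout and the size rules above can be exercised with a small decoder sketch. It is illustrative only: the function names are hypothetical, and the field boundaries (opcode in bits 31..26, src2 in 25..21, dest in 20..16, src1 in 15..11) are assumed from the bit ruler of Figure 10.1.

```python
def decode_reg_format(word):
    """Split a 32-bit REG-format instruction word into its raw fields."""
    opcode = (word >> 26) & 0x3F
    return {
        "opcode": opcode,                 # bits 31..26
        "i_bit": opcode & 1,              # bit 26: src1 is an immediate when set
        "src2": (word >> 21) & 0x1F,      # bits 25..21
        "dest": (word >> 16) & 0x1F,      # bits 20..16
        "src1": (word >> 11) & 0x1F,      # bits 15..11 (register form)
        "imm16": word & 0xFFFF,           # bits 15..0 (immediate form)
    }

def ld_st_operand_bits(word):
    """Operand size in bits for ld and st, from bit 28 and bit 0."""
    if not (word >> 28) & 1:
        return 8
    return 32 if word & 1 else 16

def split_offset(word):
    """Rebuild the 16-bit offset of st, bte, btne, and bla from its two halves."""
    high = (word >> 16) & 0x1F            # upper five bits live in the dest field
    low = word & 0x7FF                    # lower 11 bits of the instruction
    return (high << 11) | low
```

A decoder would typically call `decode_reg_format` first and then pick `src1` or `imm16` according to `i_bit`; the 16-bit value from `split_offset` still needs sign extension before use.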




[Figure 10.1: four bit-field diagrams of the 32-bit REG-format word, with 
field boundaries at bits 26/25, 21/20, 16/15, and 11/10. Field labels 
recovered from the diagrams: 
  OPCODE/I, SRC2, DEST, SRC1, IMMEDIATE, OFFSET, OR NULL 
  OPCODE (I = 1), SRC2, DEST, IMMEDIATE 
  OPCODE/I, SRC2, OFFSET HIGH, SRC1 or SRC1S, OFFSET LOW 
  SRC2, OFFSET HIGH, OFFSET LOW] 

Figure 10.1. REG-Format Variations 

Table 10.4. REG-Format Opcodes 
                                                  31 30 29 28 27 26 

ld.x                Load Integer                   0  0  0  L  0  I 
st.x                Store Integer                  0  0  0  L  1  1 
ixfr                Integer to F-P Reg Transfer    0  0  0  0  1  0 
                    (reserved)                     0  0  0  1  1  0 

fld.x, fst.x        Load/Store F-P                 0  0  1  0  LS I 
flush               Flush                          0  0  1  1  0  1 
pst.d               Pixel Store                    0  0  1  1  1  1 
ld.c, st.c          Load/Store Control Register    0  0  1  1  LS 0 

bri                 Branch Indirect                0  1  0  0  0  0 
trap                Trap                           0  1  0  0  0  1 
                    (Escape for F-P Unit)          0  1  0  0  1  0 
                    (Escape for Core Unit)         0  1  0  0  1  1 
bte, btne           Branch Equal or Not Equal      0  1  0  1  E  I 
pfld.y              Pipelined F-P Load             0  1  1  0  0  I 
                    (CTRL-Format Instructions)     0  1  1  x  x  x 

addu, -s, subu, -s  Add/Subtract                   1  0  0  SO AS I 
shl, shr            Logical Shift                  1  0  1  0  LR I 
shrd                Double Shift                   1  0  1  1  0  0 
bla                 Branch LCC Set and Add         1  0  1  1  0  1 
shra                Arithmetic Shift               1  0  1  1  1  I 

and(h)              AND                            1  1  0  0  H  I 
andnot(h)           ANDNOT                         1  1  0  1  H  I 
or(h)               OR                             1  1  1  0  H  I 
xor(h)              XOR                            1  1  1  1  H  I 
                    (reserved)                     1  1  x  x  1  0 

L   Integer Length    0 = 8 bits; 1 = 16 or 32 bits (selected by bit 0) 
LS  Load/Store        0 = Load; 1 = Store 
SO  Signed/Ordinal    0 = Ordinal; 1 = Signed 
H   High              0 = and, or, andnot, xor; 1 = andh, orh, andnoth, xorh 
AS  Add/Subtract      0 = Add; 1 = Subtract 
LR  Left/Right        0 = Left Shift; 1 = Right Shift 
E   Equal             0 = Branch on Unequal; 1 = Branch on Equal 
I   Immediate         0 = src1 is register; 1 = src1 is immediate 

[Figure 10.2: bit-field diagram of the core escape instruction word: 
bits 31..26 = 010011, SRC2 (25..21), DEST (20..16), SRC1 (15..11), 
SIZE (10..9), bits 8..5 reserved by Intel Corporation (set to zero), 
OPCODE (4..0).] 

Figure 10.2. Core Escape Instructions 
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Table 10.5. Core Escape Opcodes 
4 3 



— 


(reserved) 

















lock 


Begin Interloacked Sequence 














1 


calli 


Indirect Subroutine Call 











1 





— 


(reserved) 











1 


1 


introvr 


Trap on Integer Overflow 








1 








— 


(reserved) 








1 





1 


— 


(reserved) 








1 


1 





unlock 


End Interlocked Sequence 








1 


1 


1 


Idio* 


Load I/O 
















stio* 


Store I/O 













1 


Idint* 


Load Interrupt Vector 










1 





scyc* 


Special Cycles 










1 


1 


— 


(reserved) 







1 


X 


X 


— 


(reserved) 


1 





X 


X 


X 


— 


(reserved) 


1 


1 


X 


X 


X 




NOTE: 

*Available only with the i860 XP CPU, not with the i860 XR CPU.

10.2.2 CTRL-FORMAT INSTRUCTIONS 

The CTRL-Format instructions do not refer to registers; instead of the register fields, they have a 26-bit
relative branch offset. Figure 10.3 shows the format of these instructions, and Table 10.6 defines the encodings.











240874-79 




Bit layout (bit 31 at left):

31-29   28-26   25-0
011     OPC     BROFFSET

NOTE:

BROFFSET is a signed 26-bit relative branch offset







Figure 10.3. CTRL-Format Instructions 
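
As an illustration of the CTRL-Format layout, the following minimal sketch splits a 32-bit instruction word into OPC and BROFFSET and sign-extends the offset. It assumes OPC occupies bits 28-26 (the column heading of Table 10.6) and BROFFSET occupies bits 25-0; the function name decode_ctrl is hypothetical and is not part of any Intel tool.

```python
def decode_ctrl(word):
    """Split a 32-bit CTRL-Format word into (opc, broffset).

    Assumed layout: OPC in bits 28-26, BROFFSET in bits 25-0.
    BROFFSET is treated as a signed 26-bit two's-complement value.
    """
    opc = (word >> 26) & 0x7          # three OPC bits: 28, 27, 26
    raw = word & 0x03FFFFFF           # low 26 bits
    if raw & (1 << 25):               # sign bit of the 26-bit field
        raw -= 1 << 26                # sign-extend to a Python int
    return opc, raw


if __name__ == "__main__":
    # OPC = 010 with an all-ones offset field decodes to offset -1.
    print(decode_ctrl(0x6BFFFFFF))
```

The sketch returns the offset exactly as encoded; how the processor scales it and which address it is relative to are defined by the instruction descriptions, not by this example.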





Table 10.6. CTRL-Format Opcodes 








28  27  26   Mnemonic   Description

0   0   0               (reserved)
0   0   1               (reserved)
0   1   0    br         Branch Direct
0   1   1    call       Call
1   0   T    bc(.t)     Branch on CC Set
1   1   T    bnc(.t)    Branch on CC Clear



T Taken

0: bc or bnc

1: bc.t or bnc.t



10.2.3 FLOATING-POINT INSTRUCTION ENCODING

The floating-point instructions also constitute an escape
series. All these instructions begin with the bit
sequence 010010. Figure 10.4 shows the format of
the floating-point instructions, and Table 10.7 gives
the encodings. Within the dual-operation instructions
is a subcode DPC whose values are given in Table
10.8 along with the mnemonic that corresponds to
each.
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Bit layout (bit 31 at left):

31-26    25-21   20-16   15-11   10   9   8   7   6-0
010010   SRC2    DEST    SRC1    P    D   S   R   OPCODE


240874-80 

SRC1, SRC2  Source; one of 32 floating-point registers

DEST  Destination; one of 32 floating-point registers (except for fxfr, whose destination is one of 32 integer registers)

P  Pipelining
   1  Pipelined instruction mode
   0  Scalar instruction mode

D  Dual-Instruction Mode
   1  Dual-instruction mode
   0  Single-instruction mode

S  Source Precision
   1  Double-precision source operands
   0  Single-precision source operands

R  Result Precision
   1  Double-precision result
   0  Single-precision result



Figure 10.4. Floating-Point Instruction Encoding 
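
The field layout of Figure 10.4 can be exercised with a short decoder. This is a sketch only: it assumes the bit positions read from the figure (escape code 010010 in bits 31-26, SRC2 in 25-21, DEST in 20-16, SRC1 in 15-11, P/D/S/R in bits 10/9/8/7, OPCODE in 6-0), and the name decode_fp is hypothetical.

```python
FP_ESCAPE = 0b010010  # bit sequence that begins every floating-point instruction


def decode_fp(word):
    """Return the fields of a floating-point instruction word as a dict.

    Raises ValueError if bits 31-26 are not the floating-point escape code.
    """
    if (word >> 26) & 0x3F != FP_ESCAPE:
        raise ValueError("not a floating-point escape instruction")
    return {
        "src2": (word >> 21) & 0x1F,   # one of 32 registers
        "dest": (word >> 16) & 0x1F,
        "src1": (word >> 11) & 0x1F,
        "p": (word >> 10) & 1,         # 1 = pipelined, 0 = scalar
        "d": (word >> 9) & 1,          # 1 = dual-instruction mode
        "s": (word >> 8) & 1,          # 1 = double-precision sources
        "r": (word >> 7) & 1,          # 1 = double-precision result
        "opcode": word & 0x7F,         # 7-bit opcode, see Table 10.7
    }
```

A disassembler would look the 7-bit opcode up in Table 10.7 and, for the dual-operation instructions, read the DPC subcode from the low opcode bits.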

Table 10.7. Floating-Point Opcodes 

6 5 4 3 



pfam Add and Multiply* 
pfmam Multiply with Add* 
pfsm Subtract and Multiply* 
pfmsm Multiply with Subtract* 












1 


DPC 
DPC 


(p)fmul Multiply 

fmlow Multiply Low 

frcp Reciprocal 

frsqr Reciprocal Square Root 

pfmul3.dd 3-Stage Pipelined Multiply 









1 
1 
1 
1 
1 





















1 





1 
1 






1 



1 




(p)fadd Add 
(p)fsub Subtract 
(p)fix Fix 
(p)famov Adder Move 
pfgt/pfle** Greater Than 
pfeq Equal 
(p)ftrunc Truncate 







b 





1 

1 

1 

1 . 

1 

1 

1 











1 







1 

1 







1 
1 




1 




1 



1 



1 




fxfr Transfer to Integer Register 
(p)fiadd Long-Integer Add 
(p)fisub Long-Integer Subtract 
















1 
1 





1 









1 
1 


(p)fzchkl Z-Check Long 
(p)fzchks Z-Check Short 
(p)faddp Add with Pixel Merge 
(p)faddz Add with Z Merge 
(p)form OR with MERGE Register 















1 




1 


1 
1 






1 
1 




1 


1 

1 



1 





NOTE: 

All opcodes not shown are reserved. 

* pfam and pfsm have P-bit set; pfmam and pfmsm have P-bit clear. 

** pfgt has R bit cleared; pfle has R bit set. 
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Table 10.8. DPC Encoding 








DPC    PFAM       PFSM       M-Unit   M-Unit     A-Unit     A-Unit     T      K
       Mnemonic   Mnemonic   op1      op2        op1        op2        Load   Load*

0000   r2p1       r2s1       KR       src2       src1       M result   No     No
0001   r2pt       r2st       KR       src2       T          M result   No     Yes
0010   r2ap1      r2as1      KR       src2       src1       A result   Yes    No
0011   r2apt      r2ast      KR       src2       T          A result   Yes    Yes
0100   i2p1       i2s1       KI       src2       src1       M result   No     No
0101   i2pt       i2st       KI       src2       T          M result   No     Yes
0110   i2ap1      i2as1      KI       src2       src1       A result   Yes    No
0111   i2apt      i2ast      KI       src2       T          A result   Yes    Yes
1000   rat1p2     rat1s2     KR       A result   src1       src2       Yes    No
1001   m12apm     m12asm     src1     src2       A result   M result   No     No
1010   ra1p2      ra1s2      KR       A result   src1       src2       No     No
1011   m12ttpa    m12ttsa    src1     src2       T          A result   Yes    No
1100   iat1p2     iat1s2     KI       A result   src1       src2       Yes    No
1101   m12tpm     m12tsm     src1     src2       T          M result   No     No
1110   ia1p2      ia1s2      KI       A result   src1       src2       No     No
1111   m12tpa     m12tsa     src1     src2       T          A result   No     No


DPC    PFMAM      PFMSM      M-Unit   M-Unit     A-Unit     A-Unit     T      K
       Mnemonic   Mnemonic   op1      op2        op1        op2        Load   Load*

0000   mr2p1      mr2s1      KR       src2       src1       M result   No     No
0001   mr2pt      mr2st      KR       src2       T          M result   No     Yes
0010   mr2mp1     mr2ms1     KR       src2       src1       M result   Yes    No
0011   mr2mpt     mr2mst     KR       src2       T          M result   Yes    Yes
0100   mi2p1      mi2s1      KI       src2       src1       M result   No     No
0101   mi2pt      mi2st      KI       src2       T          M result   No     Yes
0110   mi2mp1     mi2ms1     KI       src2       src1       M result   Yes    No
0111   mi2mpt     mi2mst     KI       src2       T          M result   Yes    Yes
1000   mrmt1p2    mrmt1s2    KR       M result   src1       src2       Yes    No
1001   mm12mpm    mm12msm    src1     src2       M result   M result   No     No
1010   mrm1p2     mrm1s2     KR       M result   src1       src2       No     No
1011   mm12ttpm   mm12ttsm   src1     src2       T          M result   Yes    No
1100   mimt1p2    mimt1s2    KI       M result   src1       src2       Yes    No
1101   mm12tpm    mm12tsm    src1     src2       T          M result   No     No
1110   mim1p2     mim1s2     KI       M result   src1       src2       No     No
1111   Intel Reserved




NOTE: 

* If K-load is set, KR is loaded when operand-1 of the multiplier is KR; KI is loaded when operand-1 of the multiplier is KI.
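
A decoder or simulator might hold the DPC table as a lookup. The sketch below transcribes only the PFAM mnemonic and the T-Load and K-Load columns of Table 10.8; the names PFAM_DPC and pfam_loads are hypothetical, and the entries should be checked against the printed table before use.

```python
# DPC value -> (PFAM mnemonic, T load, K load), transcribed from Table 10.8.
PFAM_DPC = {
    0b0000: ("r2p1", False, False),
    0b0001: ("r2pt", False, True),
    0b0010: ("r2ap1", True, False),
    0b0011: ("r2apt", True, True),
    0b0100: ("i2p1", False, False),
    0b0101: ("i2pt", False, True),
    0b0110: ("i2ap1", True, False),
    0b0111: ("i2apt", True, True),
    0b1000: ("rat1p2", True, False),
    0b1001: ("m12apm", False, False),
    0b1010: ("ra1p2", False, False),
    0b1011: ("m12ttpa", True, False),
    0b1100: ("iat1p2", True, False),
    0b1101: ("m12tpm", False, False),
    0b1110: ("ia1p2", False, False),
    0b1111: ("m12tpa", False, False),
}


def pfam_loads(dpc):
    """Return (t_load, k_load) for a 4-bit PFAM DPC value."""
    _, t_load, k_load = PFAM_DPC[dpc & 0xF]
    return t_load, k_load
```

For example, pfam_loads(0b0011) reports that r2apt loads both T and K, matching the two "Yes" entries in that row.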






10.3 Instruction Timings 

Generally, i860 XP microprocessor instructions take 
one clock to execute unless a freeze condition is 
invoked. Detailed times, along with freeze conditions 
and their associated delays, are shown in the table 
on the following pages. The following symbols are 
used for brevity in the timing table: 

+ n     n clocks must be added to the execution
        time if the stated conditions apply.

<-> n   The processor requires at least n clocks between
        the indicated instructions. The actual
        delay will be n minus the number of clocks
        for executing intervening instructions (or
        dual-mode pairs). If the time for intervening
        instructions is ≥ n, there is no delay.

n..m    Indicates a range of clocks. These cases
        are accompanied by a reference to a note
        where further explanation is available.

XR:     Applies to i860 XR microprocessors only.

XP:     Applies to i860 XP microprocessors only.

OA      The number of clocks to finish all outstanding
        accesses.

R1      The number of clocks from ADS# through
        the first READY# (80860XR) or BRDY#
        (80860XP) of the indicated bus activity.

R2      The number of clocks from ADS# through
        the second READY# or BRDY#.

RL      The number of clocks from ADS# through
        the last READY# or BRDY#.

RL1     XP: The number of clocks through the last
        BRDY# of the first access.

RN      XR: The number of clocks until the next
        nonrepeated address can be issued (i.e., an
        address that is not the 2nd-4th cycle of a
        cache fill, the 2nd-8th cycle of a CS8 mode
        instruction fetch, nor the 2nd cycle of a
        128-bit write).

RX      The number of clocks through READY# or
        BRDY# for the next 64-bit-or-less write cycle,
        or through the second READY# or BRDY# for the
        next 128-bit write cycle.
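
The spacing rule for entries marked with a minimum separation of n clocks reduces to a one-line calculation. The following sketch restates it; the function name interlock_delay is hypothetical.

```python
def interlock_delay(n, intervening_clocks):
    """Extra clocks the processor waits for a minimum-separation entry.

    n                   required separation between the two instructions
    intervening_clocks  clocks spent executing the instructions (or
                        dual-mode pairs) placed between them
    """
    # The delay is n minus the intervening time, and never negative.
    return max(0, n - intervening_clocks)
```

With a separation of 2..4 clocks, for instance, one intervening single-clock instruction leaves a delay of 1 to 3 clocks, while four or more intervening clocks hide the delay completely.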

NOTES: 

a. "Address path full" means one address internally
waiting for bus while external bus pipeline
full.

b. "Store path full" means two stores or one 256-bit
write-back internally waiting for bus plus external
bus pipeline full.

c. If a floating-point instruction, graphics-unit instruction,
fst, or pst is executed when a scalar
floating-point operation (other than frcp or
frsqr) is in progress, the scalar operation must
complete first: two additional clocks for fadd,
fix, fmlow, fmul.ss, fmul.sd, ftrunc, and
fsub; three additional clocks for fmul.dd. Add
one if either or both of these situations occur:

1. There is an overlap between the result register
of the previous scalar operation and
the source of the floating-point operation,
and the destination precision of the scalar
operation differs from the source precision
of the floating-point operation.

2. The floating-point operation is pipelined
and its destination is not f0.

TLB TLB miss. Five clocks plus the number of
clocks to finish two reads plus the number of
clocks to set A-bits (if necessary).

In addition, any instruction may be delayed due to an
instruction cache miss or TLB miss during the instruction
fetch. The time for a TLB miss is shown
above in note TLB. An instruction cache miss adds
the following delays:

• The number of clocks to get the next instruction
from the bus (ADS# clock to first READY# or
BRDY# clock, inclusive).

• XR: When any of the instructions in the new
instruction-cache line is a branch or call or causes
a freeze, the time through the last READY# for
the new line.

• If the data cache is being accessed when the
instruction-cache miss occurs, two clocks for data
cache miss; one clock for hit.

Not included in the table is the delay caused by a 
trap. This depends on the trap handler. 

In dual instruction mode, each pair of instructions 
requires the maximum of the times required by each 
individual instruction. 
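
The dual-instruction-mode rule can be sketched as taking the larger of the two times for each pair and summing over the pairs. The helper names below are hypothetical, and the per-instruction clock counts are assumed to have been worked out from the timing table already.

```python
def dual_pair_clocks(core_clocks, fp_clocks):
    """Clocks for one dual-mode pair: the maximum of the two members."""
    return max(core_clocks, fp_clocks)


def dual_sequence_clocks(pairs):
    """Total clocks for a list of (core_clocks, fp_clocks) pairs."""
    return sum(dual_pair_clocks(core, fp) for core, fp in pairs)
```

A pair whose core instruction takes 3 clocks and whose floating-point instruction takes 1 clock therefore costs 3 clocks, not 4.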
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Instruction 


Execution 
Clocks 


Condition 


adds        1

addu        1

and         1

andh        1

andnot      1

andnoth     1

bc          1           If branch not taken.
            2           If branch taken.
            + 1         If the prior instruction is addu, adds, subu, subs, pfeq, or pfgt.

bc.t        1           If branch taken.
            2           If branch not taken.
            + 1         If the prior instruction is addu, adds, subu, subs, pfeq, or pfgt.

bla         1           If branch taken.
            2           If branch not taken.

bnc         (same as bc)

bnc.t       (same as bc.t)

br          1

bri         2

bte         1           If branch not taken.
            3           If branch taken.

btne        (same as bte)

call        1
            + 1         If r1 referenced in next instruction.
            + 1 + R1    If data cache load miss in progress for a read of less than 128 bits.
            + 1 + R2    If data cache load miss in progress for 128-bit read.

calli       2
            + 1         If r1 referenced in next instruction.
            + 1 + R1    If data cache load miss in progress for a read of less than 128 bits.
            + 1 + R2    If data cache load miss in progress for 128-bit read.

fadd.p      1           (. . . and all other A-unit instructions except dual operations)
            <-> 2..4    If executed when a scalar floating-point operation (other than frcp
                        or frsqr) is in progress.(c)
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Instruction


Execution
Clocks


Condition



faddp       1           (. . . and all other G-unit instructions except fiadd.w, fxfr)
            + 1         If fdest is used by next instruction and next instruction is G-, M- or A-unit instruction.
            <-> 2..4    If executed when a scalar floating-point operation (other than frcp or frsqr) is in
                        progress.(c)

faddz       (same as faddp)

famov.r     (same as fadd.p)

fiadd.w     1
            + 1         If fdest is used by next instruction and next instruction is M- or A-unit instruction
                        (except when fiadd is used for fmov.dd or fmov.ss).
            + 1         If fdest is used by next instruction and next instruction is G-unit instruction.
            <-> 2..4    If executed when a scalar floating-point operation (other than frcp or frsqr) is in
                        progress.(c)

fisub.w     (same as faddp)

fix.v       (same as fadd.p)

fld.y       1
            + 1         If this is the instruction after a st, fst or pst that hits the data cache.
            <-> 2       If fdest is referenced in the next two instructions.
            + 1 + R1    If 32-bit fld.l or 64-bit fld.d misses the data cache.
            + 1 + R2    If 128-bit fld.q misses the data cache.
            + 1 + RL    If data cache load miss in progress (except in the following case).
            <-> 2       XP: If this instruction follows a data cache access that misses in the virtual tags but
                        hits in the physical tags.
            + 2         XP: If the prior instruction is a pfld.y that hits a modified line in the data cache.
            + R2        XP: If data-cache line write-back due to snoop is in progress.
            + RN        XR: If address path full.(a)
            + RL1       XP: If address path full.(a)
            + TLB       If TLB miss.

flush       1
            <-> 3       XR: If preceded by another flush.
            <-> 2       XP: If preceded by another flush.
            + R2        XP: If data-cache line write-back due to snoop is in progress.
            + 1 + RX    If flush to modified line when store path full.(b)
            + TLB       If TLB miss.

fmlow.dd    1           (. . . and all other M-unit instructions except dual operations)
            + 1         If fsrc1 refers to result of the prior operation (either scalar or pipelined).
            + 1         If the prior operation is a double-precision multiply.
            <-> 2..4    If executed when a scalar floating-point operation (other than frcp or frsqr) is in
                        progress.(c)

fmov.r      fmov.ss and fmov.dd same as fiadd.w
            fmov.sd and fmov.ds same as fadd.p

fmul.p      (same as fmlow.dd)
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Instruction 



Execution 
Clocks 



Condition 



fnop        1

form        (same as faddp)

frcp.p      (same as fmlow.dd)

frsqr.p     (same as fmlow.dd)

fst.y       1
            + 1         If followed by pipelined floating-point operation that overwrites the register
                        being stored.
            + 1 + RL    If data cache load miss in progress.
            + 2         XP: If the prior instruction is a pfld.y that hits a modified line in the data cache.
            <-> 2       XP: If this instruction follows a data cache access that misses in the virtual
                        tags but hits in the physical tags.
            + R2        XP: If data-cache line write-back due to snoop is in progress.
            <-> 2..4    If executed when a scalar floating-point operation (other than frcp or frsqr) is
                        in progress.(c)
            + RN        XR: If address path full.(a)
            + RL1       XP: If address path full.(a)
            + 1 + RX    If cache miss when store path full.(b)
            + TLB       If TLB miss.

fsub.p      (same as fadd.p)

ftrunc.v    (same as fadd.p)

fxfr        1
            + 1         If idest referenced in next instruction.
            + 1 + R1    If data cache load miss in progress for 64-bit read.
            + 1 + R2    If data cache load miss in progress for 128-bit read.
            <-> 2..4    If executed when a scalar floating-point operation (other than frcp or frsqr) is
                        in progress.(c)

fzchkl      (same as faddp)

fzchks      (same as faddp)

intovr      1

ixfr        1
            + 1 + R1    If data cache load miss in progress for 64-bit read.
            + 1 + R2    If data cache load miss in progress for 128-bit read.
            <-> 2       If fdest is referenced in the next two instructions.

ld.c        1
            + 1         If idest referenced in next instruction.
            + 1 + R1    If data cache load miss in progress for 64-bit read.
            + 1 + R2    If data cache load miss in progress for 128-bit read.






Instruction 



Execution 
Clocks 



Condition 



ld.x        1
            + 1         If idest referenced in next instruction.
            + 1         If this is the instruction after a st, fst or pst that hits the data cache.
            + 1 + RL    If data cache load miss in progress.
            <-> 1 + R1  If ld.x misses the data cache and a subsequent instruction references the
                        idest of the ld.x (except for following case).
            <-> 2       XP: If this instruction follows a data cache access that misses in the virtual
                        tags but hits in the physical tags.
            + 2         XP: If the prior instruction is a pfld.y that hits a modified line in the data cache.
            + R2        XP: If data-cache line write-back due to snoop is in progress.
            + RN        XR: If address path full.(a)
            + RL1       XP: If address path full.(a)
            + 1 + RX    If cache miss when store path full.(b)
            + TLB       If TLB miss.

ldint.x     1 + OA

ldio.x      1 + OA

lock        1

mov         1

nop         1

or          1

orh         1

pfadd.p     (same as fadd.p)

pfaddp      (same as faddp)

pfaddz      (same as faddp)

pfam.p      1           (. . . and all other dual operations)
            + 1         If fsrc1 refers to result of the prior operation (either scalar or pipelined).
            + 1         If the prior operation is a double-precision multiply.
            <-> 2..4    If executed when a scalar floating-point operation (other than frcp or frsqr) is
                        in progress.(c)

pfamov.r    (same as fadd.p)

pfeq.p      (same as fadd.p)

pfgt.p      (same as fadd.p)

pfiadd.w    (same as faddp)

pfisub.w    (same as faddp)

pfix.v      (same as fadd.p)
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Instruction 


Execution
Clocks


Condition


pfld.y      1
            + 1 + RL    If data cache load miss in progress.
            <-> 2       If fdest is referenced in the next two instructions.
            + 1 + RL1   If three pfld's are outstanding.
            + 2 + OA    XR: If pfld hits data cache.
            + 2         XP: If the prior instruction is a pfld.y that hits a modified line in the
                        data cache.
            <-> 2       XP: If this instruction follows a data cache access that misses in
                        the virtual tags but hits in the physical tags.
            + R2        XP: If data-cache line write-back due to snoop is in progress.
            + RN        XR: If address path full.(a)
            + RL1       XP: If address path full.(a)
            + TLB       If TLB miss.

pfle.p      1

pfmam.p     (same as pfam.p)

pfmov.r     pfmov.ss and pfmov.dd same as faddp
            pfmov.sd and pfmov.ds same as fadd.p

pfmsm.p     (same as pfam.dd)

pfmul.p     (same as fmlow.dd)

pfmul3.dd   (same as fmlow.dd)

pform       (same as faddp)

pfsm.p      (same as pfam.dd)

pfsub.p     (same as fadd.p)

pftrunc.v   (same as fadd.p)

pfzchkl     (same as faddp)

pfzchks     (same as faddp)

pst.d       (same as fst.d)

scyc.x      1 + OA

shl         1

shr         1

shra        1

shrd        1

st.c        3
            + 1 + R1    If data cache load miss in progress for a read of less than 128 bits.
            + 1 + R2    If data cache load miss in progress for 128-bit read.
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Instruction 


Execution 
Clocks 


Condition 


st.x        1
            + 1 + RL    If data cache load miss in progress.
            + 2         XP: If the prior instruction is a pfld.y that hits a modified line in the data cache.
            <-> 2       XP: If this instruction follows a data cache access that misses in the virtual
                        tags but hits in the physical tags.
            + R2        XP: If data-cache line write-back due to snoop is in progress.
            + RN        XR: If address path full.(a)
            + RL1       XP: If address path full.(a)
            + 1 + RX    If cache miss when store path full.(b)
            + TLB       If TLB miss.

stio.x      1 + OA

subs        1

subu        1

trap        1

unlock      1

xor         1

xorh        1


10.4 Instruction Characteristics 

The following table lists some of the characteristics
of each instruction. The characteristics are:

• What processing unit executes the instruction.
The codes for processing units are:

A Floating-point adder unit

E Core execution unit

G Graphics unit

M Floating-point multiplier unit

• Whether the instruction is pipelined or not. A P
indicates that the instruction is pipelined.

• Whether the instruction is a delayed branch
instruction. A D marks the delayed branches.

• Whether execution is suppressed in user mode.
An SU marks supervisor-only instructions.

• Whether the instruction is available on both the
i860 XR and i860 XP microprocessors. An XP
marks instructions that are available only on the
i860 XP microprocessor.

• Whether the instruction changes the condition
code CC. A CC marks those instructions that
change CC.

• Which faults can be caused by the instruction.
The codes used for exceptions are:

IT Instruction Fault

SE Floating-Point Source Exception

RE Floating-Point Result Exception, including
overflow, underflow, inexact result

DAT Data Access Fault

Note that this is not the same as specifying at which
instructions faults may be reported. A result exception
is reported on the subsequent floating-point
instruction, pst, fst, or sometimes fld, pfld, and ixfr.

The instruction access fault IAT and the interrupt
trap IN are not shown in the table because they can
occur for any instruction.

• Performance notes. These comments regarding
optimum performance are recommendations only.
If these recommendations are not followed, the
i860 XP microprocessor automatically waits the
necessary number of clocks to satisfy internal
hardware requirements. The following notes define
the numeric codes that appear in the instruction
table:

1. The following instruction should not be a conditional
branch (bc, bnc, bc.t, or bnc.t).

2. The destination should not be a source operand
of the next two instructions.

3. A load should not directly follow a store that is
expected to hit in the data cache.

4. When the prior instruction is scalar, fsrc1
should not be the same as the fdest of the prior
operation.
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5. The fdest should not reference the destination 
of the next instruction if that instruction is a 
pipelined floating-point operation. 

6. The destination should not be a source operand
of the next instruction. (For call and calli,
the destination is r1.)

7. When the prior operation is scalar and multiplier
op1 is fsrc1, fsrc2 should not be the same as
the fdest of the prior operation.

8. When the prior operation is scalar, src1 and
src2 of the current operation should not be the
same as dest of the prior operation.

9. A pfld should not immediately follow a pfld.

• Programming restrictions. These indicate combinations
of conditions that must be avoided by programmers,
assemblers, and compilers. The following
notes define the alphabetic codes that
appear in the instruction table:

a. The sequential instruction following a delayed
control-transfer instruction may not be another
control-transfer instruction, nor a trap instruction,
nor the target of a control-transfer instruction.

b. When using a bri to return from a trap handler,
programmers should take care to prevent traps
from occurring on that or on the next sequential
instruction. IM should be zero (interrupts
disabled) when the bri is executed.

c. If fdest is not zero, fsrc1 must not be the same
as fdest.

d. When fsrc1 goes to multiplier op1 or to KR or
KI, fsrc1 must not be the same as fdest.

e. If dest is not zero, src1 and src2 must not be
the same as dest.

f. isrc1 must not be the same register as isrc2 for
the autoincrementing form of this instruction.

g. isrc1 must not be the same register as isrc2.

h. flush must not be used in a locked sequence
or in dual instruction mode.




Instruction 


Execution 
Unit 


Pipelined? 

Delayed? 

Supervisor? 

i860TM XP Only? 


Sets 
CC? 


Faults 


Performance 
Notes 


Programming 
Restrictions 


adds 

addu 

and 

andh 

andnot 


E 
E 

E 
E 
E 




cc 

CC 

cc 
cc 
cc 




1 
1 




andnoth 

be 

bet 

bla 

bnc 


E 
E 
E 
E 
E 


D 
D 


cc 






a 
a,g 


bnc.t 

br 

bri 

bte 

btne 


E 
E 
E 
E 
E 


D 
D 
D 








a 

a 

a,b 


call 

calli 

fadd.p 

faddp 

faddz 


E 
^ E 
A 
G 
G 


D 
D 




SE,RE 


6 
6 

8 
8 


a 
a 


famov.r 

fiadd.w 

fisub.w 

fix.p 

fld.y 


A 
G 
G 
A 
E 






SE.RE 

SE.RE 
DAT 


8 
8 

2,3 


f 



NOTES: 

* On the i860 XP microprocessor, the pipelined instructions can generate ITR with PI. 
** On the i860 XR microprocessor, the 128-bit pfld.q is not available. If used, it causes an instruction trap.
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Instruction 


Execution 
Unit 


Pipelined? 

Delayed? 

Supervisor? 

i860TM XP Only? 


Sets 
CC? 


Faults 


Performance 
Notes 


Programming 
Restrictions 


flush 

fmlow.dd 

fmul.p 

form 

frcp.p 


E 
M 
M 
G 
M 






SE.RE 
SE,RE 


4 

4 , 
8 


h 


frsqr.p 

fst.y 

fsub.p 

ftrunc.p 

fxfr 


M 
E 
A 
A 
G 






SE.RE 

DAT 

SE.RE 

SE.RE 


5 
6,8 


f 


fzchkl 

fzchks 

Intovr 

ixfr 

Id.c 


G 
G 

E 
E 
E 






IT 


8 
8 

2 




Id.x 

Idint.x 

Idio.x 

lock 

or 

orh 


E 
E 
E 
E 
E 
E 


SU,XP 
SU,XP 


cc 

CC 


DAT 
DAT 
DAT 


6 




pfadd.p 
pfaddp 
pfaddz 
pfam.p 
pfamov.r 


A 
G 
G 
A&M 
A 


P 
p 
p 
p 
p 




SE.RE* 

* 

* 

SE.RE* 
SE,RE* 


8 
8 
7 


e 
e 
d 


pfeq.p 

pfgt.p 

pfiadd.w 
pfisub.w 
pfix.p 


A 
A 
G 
G 
A 


p 
p 
p. 
p 
p 


cc 
cc 


SE* 

SE* 

* 

* 
SE,RE* 


1 
1 
8 
8 


e 
e 


pfld.y 

pfmam.p 

pfmsm.p 

pfmul.p 

pfmul3.dd 


E 
A&M 
A&M 

M 

M 


P,(XP)** 

p 
p 
p 
p 




DAT* 
SE.RE* • 
SE.RE* 
SE,RE* 
SE.RE* 


2,9 
7 

7 
4 
4 


f 
d 
d 
c 
c 


pform 

pfsm.p 

pfsub.p 

pftrunc.p 

pfzchkl 


G 
A&M 
A 
A 
G 


p 
p 
p 
p 
p 




* 

SE.RE* 
SE.RE* 

SE.RE* 

* 


8 "" 
7 

8 


e 
d 



NOTES: 

* On the i860 XP microprocessor, the pipelined instructions can generate ITR with PI. 

** On the i860 XR microprocessor, the 128-bit pfld.q is not available. If used, it causes an instruction trap.






Instruction 


Execution 
Unit 


Pipelined? 

Delayed? 

Supervisor? 

"186OTM XP Only? 


Sets 
CC? 


Faults 


Performance 
Notes 


Programming 
Restrictions 


pfzchks 

pst.d 

scyc.x 

shl 

shr 


G 

E 
E 

E 
E 


P 
SU.XP 




* 

DAT 
DAT 


8 
5 


f 


shra 

shrd 

st.c 

st.x 

stio.x 


E 
E 

E 
E 

E 


SU,XP 




DAT 
DAT 






subs 

subu 

trap 

unlock 

xor 

xorh 


E 
E 
E 
E 
E 
E 




cc 

CC 

cc 
cc 


IT 


1 

1 






NOTES: 

* On the i860 XP microprocessor, the pipelined instructions can generate ITR with PI.
** On the i860 XR microprocessor, the 128-bit pfld.q is not available. If used, it causes an instruction trap.



10.5 Software Compatibility 

10.5.1 REQUIRED CHANGES 

To port existing systems software from the i860 XR
microprocessor to the i860 XP microprocessor, the
following changes may be required. Applications
software does not require changes.

1. Data cache flush. All four ways of the data cache
must be flushed on the i860 XP microprocessor.
The cache flush routine can be modified to check
processor type in epsr or the DCS field of
dirbase and flush the appropriate number of
ways.

2. Parity and bus error traps. If the i860 XP system
signals these errors, the trap handler must be extended
to handle them. Software must avoid testing
the BEF and PEF bits unless executing on the
i860 XP microprocessor.

3. LOCK# deactivation. On the i860 XP microprocessor,
traps do not automatically deactivate the
LOCK# signal, so the trap handler must do a
data access to deactivate LOCK#. Trap handlers
that already access data soon after invocation do
not require this modification.

4. Load pipe precision. The precision of the last
stage of the load pipeline is specified by the LRP
bit on the i860 XR microprocessor but by the
LRP0 and LRP1 bits on the i860 XP microprocessor.
The procedure that restores the load pipe
must check the processor type, use the appropriate
bits, and restore the correct precision. Pipe
restoration code for the i860 XR microprocessor
will work correctly on the i860 XP microprocessor
if pfld.q is not used.

5. Pre-accessed trap handler pages. Page-directory
and page-table entries for the instruction pages
of the trap handler and for the first data page
accessed by the trap handler must always have
A = 1. Software modified to allocate page tables
this way works on both i860 XR and i860 XP
microprocessors.

6. Page directory entry bit 7 must be zero. This is
the bit that selects four Mbyte or four Kbyte page
size. On the i860 XR microprocessor, it is reserved
and should be set to zero. It must be set
to zero for four Kbyte pages to work on the
i860 XP microprocessor.
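
Item 6 amounts to a single-bit test on each page directory entry. The sketch below shows one way a porting check might express it; only the position of the page-size bit (bit 7) comes from the text above, and the function and constant names are hypothetical.

```python
PDE_PAGE_SIZE_BIT = 7  # bit 7 of a page directory entry selects the page size


def pde_selects_4k_pages(pde):
    """True if bit 7 of the 32-bit page directory entry is zero."""
    return (pde >> PDE_PAGE_SIZE_BIT) & 1 == 0


def clear_page_size_bit(pde):
    """Return the entry with bit 7 forced to zero, all other bits unchanged."""
    return pde & ~(1 << PDE_PAGE_SIZE_BIT) & 0xFFFFFFFF
```

Systems software written for the i860 XR microprocessor could run such a check over its page directories before moving to the i860 XP microprocessor, since an entry that left the reserved bit set would no longer select four Kbyte pages.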

10.5.2 PERFORMANCE OPTIMIZATIONS 

Software developers may wish to make the following
performance enhancements in systems software for
the i860 XP microprocessor. Systems software that
must execute on both i860 XP and i860 XR systems
can contain code both with and without the optimizations.
By testing the processor type, the appropriate
instruction path can be determined.
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1. Data cache flush. On the i860 XP microprocessor,
a complete flushing of the data cache is not
needed when changing context or marking a
page not present.

2. The epsr bits AI, DI, PI, and PT can be used on
the i860 XP microprocessor to make trap handlers
more efficient.

3. Four-Mbyte pages can be allocated to frame buffers
and the operating-system kernel, thereby reducing
the cost of TLB misses.

10.5.3 NEW FEATURES 

Software that uses the new features available only
on the i860 XP microprocessor will not be compatible
with the i860 XR microprocessor unless alternate
instruction paths are provided.

Systems software features:

1. New instructions ldio, stio, ldint, and scyc.

2. Four-Mbyte pages.

3. Privileged Registers p0, p1, p2, and p3.

4. Concurrency control unit.

5. 128-bit load instruction pfld.q.

6. Support for virtual address aliases.

Applications software features:

1. Concurrency control unit.

2. 128-bit load instruction pfld.q. The i860 XR microprocessor
traps on pfld.q; therefore, software
has the opportunity to emulate a pfld.q with two
pfld.d instructions. However, this strategy does
not yield optimal performance on the i860 XR
microprocessor.

10.5.4 NOTES 

On the i860 XP microprocessor, pages with WT = 1
are cached with the write-through policy, whereas
on the i860 XR microprocessor they are not cached
at all. Because this change in the function of WT
was anticipated in the i860 XR microprocessor documentation,
no incompatibility should arise.



11.0 REVISION HISTORY 

DATA SHEET REVISION REVIEW 

The following list represents the major differences
between version 002 and version 001 of the i860 XP
Microprocessor Data Sheet.

Section 2.2.4      AI bit has been changed to TAI in
                   Figure 2.5. The explanation for PI
                   bit has been expanded.

Section 4.2.33     PCHK# signal description has
                   been expanded.

Section 4.2.35     Output buffer configuration has
                   been added in PEN# signal description.

Section 4.2.37     RESET description has been expanded.

Section 5.1.3      Table 5.2 has been corrected.
                   The explanation of write/read and
                   read/write pipelining has been revised.

Section 5.2.2.4-5  The explanation of late back-off
                   mode has been expanded.

Section 5.2.4      Figure 5.27 has been corrected.

Section 5.3.4      The explanation of EWBE# timing
                   has been corrected.

Section 5.5        RESET initialization description
                   has been expanded.

Section 9.2        D.C. Characteristics are corrected.

Section 9.3        A.C. Characteristics are replaced
                   with nominal timings based on
                   C_L = 0 pF.

                   Figure 9.3 and Figure 9.4 have
                   been replaced with nominal A.C.
                   timings based on C_L = 0 pF.

                   Figure 9.5 has been corrected for
                   normal and high-current output
                   buffers.

Section 9.4        Component buffer model has
                   been added.

Section 10.4       Programming restriction on flush
                   instruction has been added.
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8-bit pixel 

data type, 2.1.4 

16-bit pixel

data type, 2.1.4

16-bit values 

alignment requirements, 2.3 

32-bit binary floating-point 
single-precision real, 2.1.3 

32-bit integer
data type, 2.1.1

32-bit ordinal
data type, 2.1.2

32-bit pixel 

data type, 2.1.4 

32-bit values 

alignment requirements, 2.3 

64-bit binary floating-point 
double-precision real, 2.1.3 
floating-point register file, 2.2.2 

64-bit integer
data type, 2.1.1
floating-point register file, 2.2.2 

64-bit values 

alignment requirements, 2.3 

128-bit load and store instructions 
floating-point register file, 2.2.2 

128-bit values 

alignment requirements, 2.3 

82495XP/82490XP cache 
BRDY# (burst ready), 4.2.7 
external secondary cache, 1.0 
write-once policy, 3.2.4.2 

A31-A3 (address pins) 
signal description, 4.2.1 

A (accessed) 

page-table entries (PTEs), 2.4.4.6 



AA 

fsr U-bit (update bit), 2.2.8 

access rights 

address translation caches, 3.1 

A.C. characteristics 
electrical data, 9.3 

addressing 

i860 XP microprocessor, 2.3 
modes, 2.7 

address space 
consistency, 3.3.1 

address translation 
algorithm, 2.4.5 
caches, 3.1 
faults, 2.4.6 
P (present) bit, 2.4.4.2 
virtual addressing, 2.4 

adds (Add Signed) 

epsr OF (overflow flag), 2.2.4 
instruction definition, 10.1 
instruction timing, 10.3 

addu (Add Unsigned) 

epsr OF (overflow flag), 2.2.4 
instruction definition, 10.1 
instruction timing, 10.3 

ADS# (address status) 

AHOLD (address hold), 4.2.3 
signal description, 4.2.2 

AE 

fsr U-bit (update bit), 2.2.8 

AHOLD (address hold) 
bus arbitration, 5.2 
signal description, 4.2.3 

algorithm 

address translation, 2.4.5 
cache replacement, 3.2.3 

aliasing 

instruction cache, 3.2.2 

internal instruction and data caches, 3.2 

alignment 

requirements, 2.3 

andh (Logical AND High) 
instruction definition, 10.1 
instruction timing, 10.3 

and (Logical AND) 

instruction definition, 10.1 
instruction timing, 10.3 

andnoth (Logical AND NOT High) 
instruction definition, 10.1 
instruction timing, 10.3 

andnot (Logical AND NOT) 
instruction definition, 10.1 
instruction timing, 10.3 

ANSI/IEEE Standard 754-1985, 1.0 

AO 

fsr U-bit (update bit), 2.2.8 

arbitration 

bus operation, 5.2 
HOLD and HLDA, 5.2.1 

ATE (address translation enable) 
address translation, 2.4 
dirbase format description, 2.2.6 

AU 

fsr U-bit (update bit), 2.2.8 

B 

back-off 

bus cycle, 5.2.2 
late modes, 5.2.2.3 
one-clock late mode, 5.2.2.4 
two-clock late mode, 5.2.2.5 

bc (Branch on CC) 

instruction definition, 10.1 
instruction timing, 10.3 

bc.t (Branch on CC, Taken) 
instruction definition, 10.1 
instruction timing, 10.3 



BE7#-BE0# (byte enables) 
signal description, 4.2.4 

bear (bus error address register) 
format description, 2.2.10 

BE (big endian) 
data cache, 3.2.1 
epsr format description, 2.2.4 

BEF (bus error flag) 

epsr format description, 2.2.4 

BEn# 

BE7#-BE0# (byte enables), 4.2.4 

BERR (bus error) 

bear (bus error address register), 2.2.10 

bus error trap, 2.8.7 

epsr BEF (bus error flag), 2.2.4 

psr IM (interrupt mode), 2.2.3 

signal description, 4.2.5 

big endian mode 
addressing, 2.3 

bla (Branch on LCC and Add) 

epsr AI (trap on autoincrement instruction), 2.2.4 
instruction definition, 10.1 
instruction timing, 10.3 

BL (bus lock) 

dirbase format description, 2.2.6 

bnc (Branch on Not CC) 
instruction definition, 10.1 
instruction timing, 10.3 

bnc.t (Branch on Not CC, Taken) 
instruction definition, 10.1 
instruction timing, 10.3 

BOFF# (back-off) 

ADS# (address status), 4.2.2 

BERR (bus error), 4.2.5 

bus arbitration, 5.2 

dirbase LB (late back-off mode), 2.2.6 

FLINE# choice, 5.3.5.1 

signal description, 4.2.6 

boundary scan 

register cell ordering, 6.5 

BPR (bypass register) 
test, 6.2 

br (Branch Direct Unconditionally) 
instruction definition, 10.1 
instruction timing, 10.3 

BR (break read) 

debugging i860 XP microprocessor, 2.9 
psr format description, 2.2.3 

BRDY# (burst ready) 

bear (bus error address register), 2.2.10 

BERR (bus error), 4.2.5 

epsr IL (interlock), 2.2.4 

locked access, 3.2.4.3 

signal description, 4.2.7 

write-once policy, 3.2.4.2 

BREQ (bus request) 
signal description, 4.2.8 

bri (Branch Indirect Unconditionally) 
instruction definition, 10.1 

bri (Branch Indirect Unconditionally) 
instruction timing, 10.3 

BS (bus or parity error trap in supervisory mode) 
epsr format description, 2.2.4 

BSR (boundary scan register) 
test, 6.2 

bte (Branch If Equal) 

instruction definition, 10.1 
instruction timing, 10.3 

btne (Branch If Not Equal) 
instruction timing, 10.3 

buffer 

models, 9.4 

size, selection with PEN#, 4.2.35, 5.5, 9.4.3 

burst cycles 
bus cycle, 5.1.2 

bus arbitration 
bus operation, 5.2 



bus and cache control unit 
function of, 1.0 

bus cycles 

back-off and restart, 5.2.2 
bus operation, 5.1 
type output pins, 4.1 

bus errors 

bear (bus error address register), 2.2.10 
trap, 2.8.7 

bus operation 

i860 XP microprocessor, 5.0 

BW (break write) 

debugging i860 XP microprocessor, 2.9 
psr format description, 2.2.3 

BYPASS# (bypass) 
signal description, 4.2.9 
TAP encoding, 6.3 



CACHE# (cacheability) 

BE7#-BE0# (byte enables), 4.2.4 
signal description, 4.2.10 

cache 

address translation, 3.1 
consistency protocol, 3.2.4 
external secondary, 1.0 
inquiry cycles (snooping), 5.3 
internal instruction and data, 3.2 
invalidating entries, 3.3 
on-chip, 3.0 
replacement algorithm, 3.2.3 

cacheability 

address translation caches, 3.1 
consistency, 3.3.4 

calli (Indirect Subroutine Call) 
instruction definition, 10.1 
instruction timing, 10.3 

call (Subroutine Call) 

instruction definition, 10.1 
instruction timing, 10.3 

capture-DR 

test state, 6.4.5 

capture-IR 

test state, 6.4.11 

CC (condition code) 

psr format description, 2.2.3 

ccr (concurrency control register) 
DCCU initialization, 2.5.1 
format description, 2.2.12 

CCUBASE 

ccr (concurrency control register), 2.2.12 
DCCU addressing, 2.5.2 
DCCU initialization, 2.5.1 

CD (cache disable) 

bypassing instruction and data cache, 3.3 
page-table entries (PTEs), 2.4.4.5 

CLK (clock) 

signal description, 4.2.11 

CO (CCU on) 

ccr (concurrency control register), 2.2.12 

color intensity shading 
pixel formats, 2.1.4 

compatibility 

pipelined cycles, 5.1.3 
software changes, 10.5.1 

concurrency control unit (CCU) 

ccr (concurrency control register), 2.2.12 
detached CCU, 2.5 
NEWCURR register, 2.2.13 

consistency 

address space, 3.3.1 
cacheability, 3.3.4 
instruction cache, 3.3.2 
internal cache, 3.3 
load pipe, 3.3.5 
page table, 3.3.3 
protocol, 3.2.4 
write-once policy, 3.2.4.2 

control registers 
register set, 2.2 



copy-back policy 

data cache update, 3.2.1.1 

core execution unit 
function of, 1.0 

CS8 (code size 8-bit) 

BE7#-BE0# (byte enables), 4.2.4 
dirbase format description, 2.2.6 

CTRL-format 

instructions, 10.2.2 

CTYP (cycle type) 

signal description, 4.2.12 

current mode 

high vs. normal, 4.2.35, 5.5, 9.3, 9.4.3 

cycles 

back-off, 5.2.2.1 
burst cycles, 5.1.2 
interrupt acknowledge, 5.1.4 
pipelined, 5.1.3 
restart, 5.2.2.2 
special bus, 5.1.5 



D63-D0 (data pins) 

signal description, 4.2.14 

data access 
fault, 2.8.5 

data cache 
bypassing, 3.3 
flushing, 3.3 
function of, 1.0 
operation, 3.2 
organization, 3.2.1 
states, 3.2.4.1 
update policies, 3.2.1.1 

data types 

i860 XP microprocessor, 2.1 

DAT (data access trap) 

debugging i860 XP microprocessor, 2.9 
psr format description, 2.2.3 

db (data breakpoint register) 

debugging i860 XP microprocessor, 2.9 

format description, 2.2.5 

psr BR (break read) and BW (break write), 2.2.3 

D-bit 

dual-instruction mode, 2.6.2 

D/C# (data/code) 

signal description, 4.2.13 

D.C. characteristics 
electrical data, 9.2 

DCCU (detached concurrency control unit) 
addressing, 2.5.2 

ccr (concurrency control register), 2.2.12 
function of, 1.0 
initialization, 2.5.1 
internals, 2.5.3 

DCS (data cache size) 

epsr format description, 2.2.4 

D (dirty) 

page-table entries (PTEs), 2.4.4.6 

debugging 

i860 XP microprocessor, 2.9 

deferred-write policy 

data cache update, 3.2.1.1 

denormal 

special floating-point values, 2.1.3 

Detached 

STAT register description, 2.2.14 

detached CCU 

i860 XP microprocessor, 2.5 

d.fnop 

dual-instruction mode, 2.6.2 

DID (device identification register) 
test, 6.2 

DIR 

virtual address, 2.4.2 



dirbase (directory base register) 
address space consistency, 3.3.1 
cache replacement algorithm, 3.2.3 
DCCU initialization, 2.5.1 
format description, 2.2.6 
instruction cache consistency, 3.3.2 
page directory, 2.4.3 
page table consistency, 3.3.3 
P (present) bit, 2.4.4.2 

disassemblers 

big endian mode, 2.3 

DI (trap on delayed instruction) 
epsr format description, 2.2.4 

DM (dual instruction mode) 
psr format description, 2.2.3 

DO (detached only) 

ccr (concurrency control register), 2.2.12 

double-precision real 
datatype, 2.1.3 

double real value 

floating-point registers, 2.1.3 

double-shift instruction 
psr SC (shift count), 2.2.3 

DP7-DP0 (data parity) 
signal description, 4.2.15 

DPC (data-path control) 

dual-operation instructions, 2.6.3 

DPS (DRAM page size) 

dirbase format description, 2.2.6 

DS (delayed switch) 

psr format description, 2.2.3 

DTB (directory table base) 

dirbase format description, 2.2.6 

dual-instruction mode 
parallelism, 2.6.2 

dual-operation instructions 
floating-point, 2.6.3 

E 

EADS# 

AHOLD (address hold), 4.2.3 

EADS# (external address status) 
signal description, 4.2.16 

epsr (extended processor status register) 
data cache, 3.2.1 
DCCU internals, 2.5.3 
format description, 2.2.4 
page-table entries (PTEs), 2.4.4.3 

EWBE# (external write buffer empty) 
epsr SO (strong ordering), 2.2.4 
signal description, 4.2.17 

exit1-DR 

test state, 6.4.7 

exit1-IR 

test state, 6.4.13 

exit2-DR 

test state, 6.4.9 

exit2-IR 

test state, 6.4.15 

EXTEST 

TAP encoding, 6.3 



faddp (Add with Pixel Merge) 
instruction definition, 10.1 
instruction timing, 10.3 

fadd.p (Floating-Point Add) 
instruction definition, 10.1 
instruction timing, 10.3 

faddz (Add with Z Merge) 
instruction definition, 10.1 
instruction timing, 10.3 

famov.r (Floating-Point Adder Move) 
instruction definition, 10.1 
instruction timing, 10.3 



fault 

address translation, 2.4.6 
data access, 2.8.5 
floating-point, 2.8.3 
instruction access, 2.8.4 
result exception fault, 2.8.3.1 
source exception fault, 2.8.3.1 

fiadd.w (Long-Integer Add) 
instruction definition, 10.1 
instruction timing, 10.3 

fir (fault instruction register) 

epsr DI (trap on delayed instruction), 2.2.4 
format description, 2.2.7 

fisub.w (Long-Integer Subtract) 
instruction definition, 10.1 
instruction timing, 10.3 

fix.v (Floating-Point to Integer Conversion) 
instruction definition, 10.1 
instruction timing, 10.3 

fld.y (Floating-Point Load) 
instruction definition, 10.1 
instruction timing, 10.3 

FLINE# (flush line) 

BOFF# choice, 5.3.5.1 
signal description, 4.2.18 

floating-point 
adder, 1.0 
control unit, 1.0 
fault, 2.8.3 

instruction encoding, 10.2.3 
multiplier, 1.0 
register file, 2.2.2 

flush (Cache Flush) 

cache replacement algorithm, 3.2.3 
dirbase RB (replacement block), 2.2.6 
flushing data cache, 3.3 
instruction definition, 10.1 
instruction timing, 10.3 
requirements summary, 3.3.6 

fmlow.dd (Floating-Point Multiply Low) 
instruction definition, 10.1 
instruction timing, 10.3 

fmov.r (Floating-Point Reg-Reg Move) 
instruction definition, 10.1 
instruction timing, 10.3 

fmul.p (Floating-Point Multiply) 

instruction definition, 10.1 
instruction timing, 10.3 

fnop (Floating-Point No Operation) 
instruction definition, 10.1 
instruction timing, 10.3 

form (OR with MERGE Register) 
instruction definition, 10.1 
instruction timing, 10.3 

frcp.p (Floating-Point Reciprocal) 
instruction definition, 10.1 
instruction timing, 10.3 

frsqr.p (Floating-Point Reciprocal Square Root) 
instruction definition, 10.1 
instruction timing, 10.3 

fsr (floating-point status register) 
format description, 2.2.8 
pipelining status information, 2.6.1.2 

fst.y (Floating-Point Store) 
instruction definition, 10.1 
instruction timing, 10.3 

fsub.p (Floating-Point Subtract) 
instruction definition, 10.1 
instruction timing, 10.3 

FTE (floating-point trap enable) 
fsr format description, 2.2.8 



FT (floating-point trap) 

psr format description, 2.2.3 

ftrunc.v (Floating-Point to Integer Conversion) 
instruction definition, 10.1 
instruction timing, 10.3 

fxfr (Transfer F-P to Integer Register) 
instruction definition, 10.1 
instruction timing, 10.3 

fzchkl (32-Bit Z-Buffer Check) 
instruction definition, 10.1 
instruction timing, 10.3 

fzchks (16-Bit Z-Buffer Check) 
instruction definition, 10.1 
instruction timing, 10.3 

FZ (flush zero) 

fsr format description, 2.2.8 



graphics unit 
function of, 1.0 

H 

hardware interface 

i860 XP microprocessor, 4.0 

HIT# (cache inquiry hit) 
signal description, 4.2.19 

HITM# (hit modified line) 

internal cache consistency, 3.3 
signal description, 4.2.20 

HLDA (bus hold acknowledge) 
signal description, 4.2.21 

HOLD (bus hold) 
bus arbitration, 5.2 
signal description, 4.2.22 

I 

i860 XP microprocessor 
bus operation, 5.0 
functional description, 1.0 
hardware interface, 4.0 
instruction set, 8.0 
mechanical data, 7.0 
on-chip caches, 3.0 
programming interface, 2.0 
testability, 6.0 

IAT (instruction access trap) 
psr format description, 2.2.3 

IDCODE 

TAP encoding, 6.3 

IEEE Standard 

for Binary Floating-Point Arithmetic, 1.0 
P1149.1/D6 testability, 6.0 

IL (interlock) 

epsr format description, 2.2.4 

IM (interrupt mode) 

psr format description, 2.2.3 

indefinite 

special floating-point values, 2.1.3 

inexact result 

result exception fault, 2.8.3.2 

initialization 
at RESET, 5.5 

infinity 

special floating-point values, 2.1.3 

IN (interrupt) 

psr format description, 2.2.3 

InLoop 

STAT register description, 2.2.14 

inquiry cycles 

data cache states, 3.2.4.1 
for line being cached, 5.3.2.1 
for line being replaced, 5.3.2.2 
snooping, 5.3 
write-back, 5.3.1 



instruction 

access fault, 2.8.4 
characteristics, 10.4 
CTRL-format, 10.2.2 
definitions, 10.1 
dual-operation, 2.6.3 
encoding floating-point, 10.2.3 
fault, 2.8.2 

format and encoding, 10.2 
REG-format, 10.2.1 
timing, 10.3 

instruction cache 
bypassing, 3.3 
consistency, 3.3.2 
function of, 1.0 
operation, 3.2 
organization, 3.2.2 

instruction set 

abbreviations, 10.0 
extensions of i860 XR, 2.6 
i860 XP microprocessor, 8.0 

INT/CS8 (interrupt/code-size 8-bits) 
signal description, 4.2.24 

integer 

datatype, 2.1.1 
register file, 2.2.1 

internal cache 
consistency, 3.3 

interrupt 

acknowledge cycles, 5.1.4 
i860 XP microprocessor, 2.8 
trap, 2.8.8 

INT (interrupt) 

epsr format description, 2.2.4 

intovr (Software Trap on Integer Overflow) 
instruction definition, 10.1 
instruction timing, 10.3 

INT pin 

epsr INT (interrupt), 2.2.4 
psr IM (interrupt mode), 2.2.3 

invalidation requirements 
summary, 3.3.6 

INV (invalidate) 

signal description, 4.2.23 

IR (instruction register) 
test, 6.3 

IRP (integer graphics) 

fsr format description, 2.2.8 

ITI (cache and TLB invalidate) 
dirbase format description, 2.2.6 

IT (instruction trap) 

psr format description, 2.2.3 

ixfr (Transfer Integer to F-P Register) 
instruction definition, 10.1 
instruction timing, 10.3 

K 

KB0, KB1 (cache block) 
signal description, 4.2.25 

KEN# (cache enable) 

BE7#-BE0# (byte enables), 4.2.4 

bypassing instruction and data cache, 3.3 

DCCU addressing, 2.5.2 

internal instruction and data caches, 3.2 

locked access, 3.2.4.3 

signal description, 4.2.26 

KI 

special purpose register description, 2.2.9 

KNF (kill next floating-point instruction) 
psr format description, 2.2.3 



KR 



special purpose register description, 2.2.9 



LB (late back-off mode) 

dirbase format description, 2.2.6 

LCC (loop condition code) 

psr CC (condition code), 2.2.3 



ld.c (Load from Control Register) 
fir (fault instruction register), 2.2.7 
instruction definition, 10.1 
instruction timing, 10.3 

ldint.x (Load Interrupt Vector) 
big endian mode, 2.3 
epsr BE (big endian), 2.2.4 
extensions of i860 XR, 2.6 
instruction definition, 10.1 
instruction timing, 10.3 

ldio.x (Load I/O) 

big endian mode, 2.3 
extensions of i860 XR, 2.6 
instruction definition, 10.1 
instruction timing, 10.3 

ld.l 

flushing data cache, 3.3 

ld.x (Load Integer) 
DCCU internals, 2.5.3 
instruction definition, 10.1 
instruction timing, 10.3 

LEN (data length) 

signal description, 4.2.27 

LFBSR (linear feedback shift register) 
cache replacement algorithm, 3.2.3 

little endian mode 
addressing, 2.3 

load pipe 

consistency, 3.3.5 

LOCK# (address lock) 
A (accessed) bit, 2.4.4.6 
cycle attribute, 5.4 
dirbase BL (bus lock), 2.2.6 
signal description, 4.2.28 

lock (Begin Interlocked Sequence) 
dirbase BL (bus lock), 2.2.6 
instruction definition, 10.1 
instruction timing, 10.3 
locked access, 3.2.4.3 






locked access 

cache consistency, 3.2.4.3 

lock instruction 

epsr IL (interlock), 2.2.4 

lock protocol 

instruction fault, 2.8.2.1 

LRP0 (load pipe result precision) 
fsr format description, 2.2.8 

LRP1 (load pipe result precision) 
fsr format description, 2.2.8 

M 

MA 

fsr U-bit (update bit), 2.2.8 

mechanical data 

i860 XP microprocessor, 7.0 

MERGE 

special purpose register description, 2.2.9 

MESI 

cache consistency protocol, 3.2.4 
write cycle reordering, 5.3.3 

MI 

fsr U-bit (update bit), 2.2.8 

M/IO# (memory-I/O) 

signal description, 4.2.29 

MO 

fsr U-bit (update bit), 2.2.8 

mov (Constant-to-Register Move) 
instruction definition, 10.1 

mov (Register-Register Move) 
instruction definition, 10.1 
instruction timing, 10.3 



MU 



fsr U-bit (update bit), 2.2.8 



N 

NA# (next address request) 
locked access, 3.2.4.3 
signal description, 4.2.30 
write-once policy, 3.2.4.2 

NaN (Not a Number) 

special floating-point values, 2.1.3 

NENE# (next near) 

dirbase DPS (DRAM page size), 2.2.6 
signal description, 4.2.31 

Nested 

STAT register description, 2.2.14 

NEWCURR register 
DCCU internals, 2.5.3 
format description, 2.2.13 

nonpipelined cycle 
bus cycle, 5.1.3 

nop (Core-Unit No Operation) 
instruction definition, 10.1 
instruction timing, 10.3 



offset 

addressing modes, 2.7 
virtual address, 2.4.2 

OF (overflow flag) 

epsr format description, 2.2.4 

on-chip caches 

i860 XP microprocessor, 3.0 

ordinal 

datatype, 2.1.2 

orh (Logical OR High) 
instruction definition, 10.1 
instruction timing, 10.3 

or (Logical OR) 

instruction definition, 10.1 
instruction timing, 10.3 

output pins 

pins overview, 4.1 

overflow 

result exception fault, 2.8.3.2 



package 

thermal specifications, 8.0 

PAGE 

virtual address, 2.4.2 

page directory 

little endian mode, 2.3 
page tables, 2.4.3 

paged virtual-address space 
addressing, 2.3 

page frame 

address, 2.4.4.1 

physical main memory, 2.4.1 

page table 

combining protection, 2.4.4.8 

consistency, 3.3.3 

entry format description, 2.4.4 

format description, 2.4.3 

little endian mode, 2.3 

for trap handlers, 2.4.4.7 

paging unit 

address translation caches, 3.1 
function of, 1.0 

parallelism 

dual-instruction mode, 2.6.2 
use of, 2.6 

parity error 

bear (bus error address register), 2.2.10 
psr IM (interrupt mode), 2.2.3 
trap, 2.8.6 

pause-DR 

test state, 6.4.8 

pause-IR 

test state, 6.4.14 



PBM (page-table bit mode) 
epsr format description, 2.2.4 

PCD (page cache disable) 

bypassing instruction and data cache, 3.3 
CD (cache disable), 2.4.4.5 
signal description, 4.2.32 

PCHK# (parity check) 
signal description, 4.2.33 

PCYC (page cycle) 

signal description, 4.2.34 

PEF (parity error flag) 

epsr format description, 2.2.4 

PEN# (parity enable) 

bear (bus error address register), 2.2.10 
parity error trap, 2.8.6 
signal description, 4.2.35 

performance optimizations 
software compatibility, 10.5.2 

pfaddp (Pipelined Add with Pixel Merge) 
instruction definition, 10.1 
instruction timing, 10.3 

pfadd.p (Pipelined Floating-Point Add) 
instruction definition, 10.1 
instruction timing, 10.3 

pfaddz (Pipelined Add with Z Merge) 
instruction definition, 10.1 
instruction timing, 10.3 

pfamov.r (Pipelined Floating-Point Adder Move) 
instruction definition, 10.1 
instruction timing, 10.3 

pfam.p (Pipelined Floating-Point Add and Multiply) 
dual-operation, 2.6.3 
instruction definition, 10.1 
instruction timing, 10.3 
special purpose registers, 2.2.9 

pfeq.p (Pipelined Floating-Point Equal Compare) 
instruction definition, 10.1 
instruction timing, 10.3 

pfgt.p (Pipelined Floating-Point Greater-Than 
Compare) 

instruction definition, 10.1 

instruction timing, 10.3 

pfiadd.w (Pipelined Long-Integer Add) 
instruction definition, 10.1 
instruction timing, 10.3 

pfisub.w (Pipelined Long-Integer Subtract) 
instruction definition, 10.1 
instruction timing, 10.3 

pfix.v (Pipelined Floating-Point to Integer 
Conversion) 

instruction definition, 10.1 

instruction timing, 10.3 

pfld (Pipelined Floating-Point Load) 
epsr PT (trap on pipeline use), 2.2.4 
load pipe consistency, 3.3.5 
pipeline loads, 2.6.1.5 

pfld.q 

extensions of i860 XR, 2.6 

pfld.y (Pipelined Floating-Point Load) 
instruction definition, 10.1 
instruction timing, 10.3 

pfle.p (Pipelined F-P Less-Than or Equal Compare) 
instruction definition, 10.1 
instruction timing, 10.3 

pfmam.p (Pipelined Floating-Point Add and Multiply) 
dual operation, 2.6.3 
instruction definition, 10.1 
instruction timing, 10.3 
special purpose registers, 2.2.9 

pfmov.r (Pipelined Floating-Point Reg-Reg Move) 
instruction definition, 10.1 
instruction timing, 10.3 

pfmsm.p (Pipelined Floating-Point Subtract 
and Multiply) 

dual operation, 2.6.3 

instruction definition, 10.1 

instruction timing, 10.3 

special purpose registers, 2.2.9 



pfmul3.dd (Three-Stage Pipelined Multiply) 
instruction definition, 10.1 
instruction timing, 10.3 

pfmul.p (Pipelined Floating-Point Multiply) 
instruction definition, 10.1 
instruction timing, 10.3 

pform (Pipelined OR to MERGE Register) 
instruction definition, 10.1 
instruction timing, 10.3 

pfsm.p (Pipelined Floating-Point Subtract 
and Multiply) 

dual-operation, 2.6.3 

instruction definition, 10.1 

instruction timing, 10.3 

special purpose registers, 2.2.9 

pfsub.p (Pipelined Floating-Point Subtract) 
instruction definition, 10.1 
instruction timing, 10.3 

pftrunc.v (Pipelined Floating-Point to 
Integer Conversion) 

instruction definition, 10.1 

instruction timing, 10.3 

pfzchkl (Pipelined 32-Bit Z-Buffer Check) 
instruction definition, 10.1 
instruction timing, 10.3 

pfzchks (Pipelined 16-Bit Z-Buffer Check) 
instruction definition, 10.1 
instruction timing, 10.3 

physical main memory 
page frame, 2.4.1 

physical tags 

internal instruction and data caches, 3.2 

PI bit 

using, 2.8.2.2 

PIM (previous interrupt mode) 
psr format description, 2.2.3 

pins overview 

hardware interface, 4.1 

pipeline 

cycles, 5.1.3 
loads, 2.6.1.5 
operations, 2.6.1 
precision in, 2.6.1.3 
scalar transition, 2.6.1.4 
status information, 2.6.1.2 

PI (pipeline instruction) 

epsr format description, 2.2.4 

pixel 

data type, 2.1.4 

PM (pixel mask) 

psr format description, 2.2.3 

P (present) 

page-table entries (PTEs), 2.4.4.2 

privileged registers 

format description, 2.2.11 

processor 

revisions, 2.2.4 
type, 2.2.4 

programming interface 

i860 XP microprocessor, 2.0 

PS (pixel size) 

psr format description, 2.2.3 

psr (processor status register) 

debugging i860 XP microprocessor, 2.9 
format description, 2.2.3 
page-table entries (PTEs), 2.4.4.3 

pst.d (Pixel Store) 

instruction definition, 10.1 

instruction timing, 10.3 

psr PS (pixel size) and PM (pixel mask), 2.2.3 

PT (trap on pipeline use) 

epsr format description, 2.2.4 
using, 2.8.2.2 

PU (previous user mode) 
psr format description, 2.2.3 



PWT (page write-through) 
signal description, 4.2.36 
WT (write-through), 2.4.4.4 



ratings 

absolute maximum, 9.1 

RB (replacement block) 

dirbase format description, 2.2.6 

RC (replacement control) 

dirbase format description, 2.2.6 

REG-format 

instructions, 10.2.1 

register cell ordering 
boundary scan, 6.5 

replacement algorithm 
cache, 3.2.3 

RESET (system reset) 

AHOLD (address hold), 4.2.3 

bear (bus error address register), 2.2.10 

cache replacement algorithm, 3.2.3 

epsr BEF (bus error flag), 2.2.4 

epsr SO (strong ordering), 2.2.4 

initialization, 5.5 

signal description, 4.2.37 

trap, 2.8.9 

restart 

bus cycle, 5.2.2 

result exception fault 
floating-point, 2.8.3.1 

right-shift instruction 

psr SC (shift count), 2.2.3 

RM (rounding mode) 

fsr format description, 2.2.8 

RR (result register) 

fsr format description, 2.2.8 

run-test/idle 
test state, 6.4.2 

SAMPLE 

TAP encoding, 6.3 

scalar 

mode, 2.6.1.1 
operations, 2.6.1 
pipelined transition, 2.6.1.4 

SC (shift count) 

psr format description, 2.2.3 

scyc.x (Special Cycles) 
big endian mode, 2.3 
epsr BE (big endian), 2.2.4 
extensions of i860 XR, 2.6 
instruction definition, 10.1 
instruction timing, 10.3 

select-DR-scan 
test state, 6.4.3 

select-IR-scan 
test state, 6.4.4 

serializing 

locked access, 3.2.4.3 

SE (source exception) 

fsr format description, 2.2.8 

shift-DR 

test state, 6.4.6 

shift-IR 

test state, 6.4.12 

shl (Shift Left) 

instruction definition, 10.1 
instruction timing, 10.3 

shra (Shift Right Arithmetic) 
instruction definition, 10.1 
instruction timing, 10.3 

shrd (Shift Right Double) 
instruction definition, 10.1 
instruction timing, 10.3 



shr (Shift Right) 

instruction definition, 10.1 
instruction timing, 10.3 

signal description 

hardware interface, 4.2 

single-precision real 
datatype, 2.1.3 

single-transfer cycle 
bus cycle, 5.1.1 

SI (sticky inexact) 

fsr format description, 2.2.8 

snooping 

inquiry cycles, 5.3 

internal instruction and data caches, 3.2 

responsibility limits, 5.3.2 

software compatibility 
required changes, 10.5.1 

SO (strong ordering) 

epsr format description, 2.2.4 

source exception fault 
floating-point, 2.8.3.1 

spare 

signal description, 4.2.38 

special bus 
cycles, 5.1.5 

special-purpose registers 
register set, 2.2 

special values 

floating-point numbers, 2.1.3 

STAT register 

DCCU internals, 2.5.3 
format description, 2.2.14 

st.c (Store to Control Register) 
address translation, 2.4 
dirbase BL (bus lock), 2.2.6 
dirbase CS8 (code size 8-bit), 2.2.6 
fsr U-bit (update bit), 2.2.8 
instruction definition, 10.1 
instruction timing, 10.3 
privileged registers, 2.2.11 

stepping number 

epsr format description, 2.2.4 

stio.x (Store I/O) 

big endian mode, 2.3 
epsr BE (big endian), 2.2.4 
extensions of i860 XR, 2.6 
instruction definition, 10.1 
instruction timing, 10.3 

strong ordering mode 
inquiry cycle, 5.3.4 

st.x (Store Integer) 
DCCU internals, 2.5.3 
instruction definition, 10.1 
instruction timing, 10.3 

subs (Subtract Signed) 

epsr OF (overflow flag), 2.2.4 
instruction definition, 10.1 
instruction timing, 10.3 

subu (Subtract Unsigned) 
epsr OF (overflow flag), 2.2.4 
instruction definition, 10.1 
instruction timing, 10.3 

supervisor/user mode 
addressing, 2.3 

ccr (concurrency control register), 2.2.12 
psr U (user mode), 2.2.3 



T 

special purpose register description, 2.2.9 



tags 



internal instruction and data caches, 3.2 



TAI (Trap On Autoincrement) 
epsr format description, 2.2.4 
fsr U-bit (update bit), 2.2.8 

TAP (test access port) 
controller, 6.4 
controller initialization, 6.6 
testability, 6.0 

TCK (test clock) 

signal description, 4.2.39 

TDI (test data input) 

signal description, 4.2.40 

TDO (test data output) 
signal description, 4.2.41 

test 

architecture, 6.1 
data registers, 6.2 

testability 

i860 XP microprocessor, 6.0 

test-logic-reset 
test state, 6.4.1 

test state 

capture-DR, 6.4.5 
capture-IR, 6.4.11 
exit1-DR, 6.4.7 
exit1-IR, 6.4.13 
exit2-DR, 6.4.9 
exit2-IR, 6.4.15 
pause-DR, 6.4.8 
pause-IR, 6.4.14 
run-test/idle, 6.4.2 
select-DR-scan, 6.4.3 
select-IR-scan, 6.4.4 
shift-DR, 6.4.6 
shift-IR, 6.4.12 
test-logic-reset, 6.4.1 
update-DR, 6.4.10 
update-IR, 6.4.16 

thermal specifications 
package, 8.0 

TI (trap inexact) 

fsr format description, 2.2.8 

TLB 

address translation caches, 3.1 
DCCU addressing, 2.5.2 
internal cache consistency, 3.3 

TMS (test mode select) 
signal description, 4.2.42 

trap handler 
invocation, 2.8.1 
page tables, 2.4.4.7 

trap (Software Trap) 
bus error, 2.8.7 
i860 XP microprocessor, 2.8 
instruction cache consistency, 3.3.2 
instruction definition, 10.1 
instruction timing, 10.3 
interrupt, 2.8.8 
parity error, 2.8.6 
RESET, 2.8.9 

tri-state 

output pins, 4.1 

TRST# (test reset) 

signal description, 4.2.43 

U 

U-bit (update bit) 

fsr format description, 2.2.8 

underflow 

result exception fault, 2.8.3.2 

unlock (End Interlocked Sequence) 
dirbase BL (bus lock), 2.2.6 
epsr IL (interlock), 2.2.4 
instruction definition, 10.1 
instruction timing, 10.3 

update-DR 

test state, 6.4.10 



update-IR 

test state, 6.4.16 

user/supervisor mode 

ccr (concurrency control register), 2.2.12 
psr U (user mode), 2.2.3 

U (user) 

page-table entries (PTEs), 2.4.4.3 
psr format description, 2.2.3 



VccCLK (clock power) 
signal description, 4.2.45 

Vcc (system power) 
signal description, 4.2.44 

virtual address 

address translation caches, 3.1 
CCUBASE, 2.2.12 
format description, 2.4.2 
i860 XP microprocessor, 2.4 

virtual tag 

instruction cache, 3.2.2 

internal instruction and data caches, 3.2 

Vss (ground) 

signal description, 4.2.44 

W 

wait state 

single-transfer cycle, 5.1.1 

WB/WT# (write-back/write-through) 
signal description, 4.2.46 
write-once policy, 3.2.4.2 

WP (write protect) 

epsr format description, 2.2.4 
page-table entries (PTEs), 2.4.4.3 

W/R# (write/read) 

signal description, 4.2.47 
write-once policy, 3.2.4.2 

write-back 

data cache update policy, 3.2.1.1 
with FLINE#, 5.3.5.2 
inquiry cycles, 5.3.1 
scheduling inquiry cycles, 5.3.5 

write cycle 

reordering due to buffering, 5.3.3 

write-once 

cache consistency, 3.2.4.2 
data cache update policy, 3.2.1.1 

write-through 

data cache update policy, 3.2.1.1 

WT (write-through) 

page-table entries (PTEs), 2.4.4.4 
write-through policy, 3.2.1.1 

W (writable) 

page-table entries (PTEs), 2.4.4.3 



xorh (Logical Exclusive OR High) 
instruction definition, 10.1 
instruction timing, 10.3 

xor (Logical Exclusive OR) 
instruction definition, 10.1 
instruction timing, 10.3 

Z 

Z-buffer 

special purpose registers, 2.2.9 

Parallel Architecture that Supports Up 
to Three Operations per Clock 

— One Integer or Control Instruction 
per Clock 

— Up to Two Floating-Point Results per 
Clock 

High Performance Design 

— 25/33.3/40 MHz Clock Rates 

— 80 Peak Single Precision MFLOPs 

— 60 Peak Double Precision MFLOPs 

— 64-Bit External Data Bus 

— 64-Bit Internal Instruction Cache Bus 

— 128-Bit Internal Data Cache Bus 

High Level of Integration on One Chip 

— 32-Bit Integer and Control Unit 

— 32/64-Bit Pipelined Floating-Point 
Adder and Multiplier Units 

— 64-Bit 3-D Graphics Unit 

— Paging Unit with Translation 
Lookaside Buffer 

— 4 Kbyte Instruction Cache 

— 8 Kbyte Data Cache 



Compatible with Industry Standards 

— ANSI/IEEE Standard 754-1985 for 
Binary Floating-Point Arithmetic 

— Intel386™/486™ Microprocessor 
Data Formats and Page Table Entries 

— JEDEC 168-pin Ceramic Pin Grid 
Array Package (see Packaging 
Outlines and Dimensions, order 
#231369) 

Easy to Use 

— On-Chip Debug Register 

— Assembler, Linker, Simulator, 
Debugger, C and FORTRAN 
Compilers, FORTRAN Vectorizer, 
Scalar and Vector Math Libraries for 
both OS/2* and UNIX* Environments 



The Intel i860™ XR Microprocessor (order codes A80860XR-25, A80860XR-33 and A80860XR-40) delivers 
supercomputing performance in a single VLSI component. The 64-bit design of the i860 XR microprocessor 
balances integer, floating point, and graphics performance for applications such as engineering workstations, 
scientific computing, 3-D graphics workstations, and multiuser systems. Its parallel architecture achieves high 
throughput with RISC design techniques, pipelined processing units, wide data paths, large on-chip caches, 
million-transistor design, and fast one-micron CHMOS IV silicon technology. 
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Figure 0.1. Block Diagram 

Intel, intel, Intel386™, Intel486™, i860 XR, Multibus II and Parallel System Bus are trademarks of Intel Corporation. 
*UNIX is a registered trademark of UNIX System Laboratories, Inc. OS/2 is a trademark of International Business Machines 
Corporation. 
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1.0 FUNCTIONAL DESCRIPTION 

As shown by the block diagram on the front page, 
the i860 XR microprocessor consists of 9 units: 

1. Core Execution Unit 

2. Floating-Point Control Unit 

3. Floating-Point Adder Unit 

4. Floating-Point Multiplier Unit 

5. Graphics Unit 

6. Paging Unit 

7. Instruction Cache 

8. Data Cache 

9. Bus and Cache Control Unit 

The core execution unit controls overall operation of 
the i860 XR microprocessor. The core unit executes 
load, store, integer, bit, and control-transfer opera- 
tions, and fetches instructions for the floating-point 
unit as well. A set of 32 x 32-bit general-purpose 
registers are provided for the manipulation of integer 
data. Load and store instructions move 8-, 16-, and 
32-bit data to and from these registers. Its full set of 
integer, logical, and control-transfer instructions give 
the core unit the ability to execute complete systems 
software and applications programs. A trap mecha- 
nism provides rapid response to exceptions and ex- 
ternal interrupts. Debugging is supported by the abili- 
ty to trap on data or instruction reference. 

The floating-point hardware is connected to a separate 
set of floating-point registers, which can be accessed 
as 16 x 64-bit registers or 32 x 32-bit registers. 
Special load and store instructions can also access 
these same registers as 8 x 128-bit registers. All 
floating-point instructions use these registers as 
their source and destination operands. 

The floating-point control unit controls both the float- 
ing-point adder and the floating-point multiplier, issu- 
ing instructions, handling all source and result 
exceptions, and updating status bits in the floating- 
point status register. The adder and multiplier can 
operate in parallel, producing up to two results per 
clock. The floating-point data types, floating-point 
instructions, and exception handling all support the 
IEEE Standard for Binary Floating-Point Arithmetic 
(ANSI/IEEE Std 754-1985). 

The floating-point adder performs addition, subtrac- 
tion, comparison, and conversions on 64- and 32-bit 
floating-point values. An adder instruction executes 
in three clocks; however, in pipelined mode, a new 
result is generated every clock. 

The floating-point multiplier performs floating-point 
and integer multiply and floating-point reciprocal 
operations on 64- and 32-bit floating-point values. A 
multiplier instruction executes in three to four clocks; 
however, in pipelined mode, a new result can be 
generated every clock for single precision and every 
other clock for double precision. 
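
The peak MFLOPS figures quoted for the 40 MHz part follow from these 
pipelined issue rates. The sketch below only restates that arithmetic; 
the function name peak_mflops is illustrative and is not Intel 
terminology. 

```python
def peak_mflops(clock_mhz, double_precision=False):
    """Peak pipelined floating-point results per microsecond.

    Assumes the adder delivers one result per clock and the multiplier
    delivers one result per clock (single precision) or one result
    every other clock (double precision), as described in the text.
    """
    adder_rate = 1.0
    multiplier_rate = 0.5 if double_precision else 1.0
    return clock_mhz * (adder_rate + multiplier_rate)


print(peak_mflops(40))                         # 80.0
print(peak_mflops(40, double_precision=True))  # 60.0
```

These are upper bounds that require both units to be kept busy in 
pipelined mode on every clock. 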

The graphics unit has special integer logic that sup- 
ports three-dimensional drawing in a graphics frame 
buffer, with color intensity shading and hidden sur- 
face elimination via the Z-buffer algorithm. The 
graphics unit recognizes the pixel as an 8-, 16-, or 
32-bit data type. It can compute individual red, blue, 
and green color intensity values within a pixel; but it 
does so with parallel operations that take advantage 
of the 64-bit internal word size and 64-bit external 
bus. The graphics features of the i860 XR micro- 
processor assume that the surface of a solid object 
is drawn with polygon patches whose shapes ap- 
proximate the original object. The color intensities of 
the vertices of the polygon and their distances from 
the viewer are known, but the distances and intensi- 
ties of the other points must be calculated by inter- 
polation. The graphics instructions of the i860 XR 
microprocessor directly aid such interpolation. 

The paging unit implements protected, paged, virtual 
memory via a 64-entry, four-way set-associative 
memory called the TLB (Translation Lookaside Buff- 
er). The paging unit uses the TLB to perform the 
translation of logical address to physical address, 
and to check for access violations. The access pro- 
tection scheme employs two levels of privilege: user 
and supervisor. 

The instruction cache is a two-way set-associative 
memory of four Kbytes, with 32-byte blocks. It trans- 
fers up to 64 bits per clock (320 Mbyte/sec at 
40 MHz). 

The data cache is a two-way set-associative memo- 
ry of eight Kbytes, with 32-byte blocks. It transfers 
up to 128 bits per clock (640 Mbyte/sec at 40 MHz). 
The i860 XR microprocessor normally uses write- 
back caching, i.e. memory writes update the cache 
(if applicable) without necessarily updating memory 
immediately; however, caching can be inhibited by 
software where necessary. 
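
The transfer rates quoted for the two caches are the bus width 
multiplied by the clock rate. A minimal sketch of that calculation 
(the function name is illustrative; 1 Mbyte is taken as 10^6 bytes): 

```python
def cache_bandwidth_mbytes(bits_per_clock, clock_mhz):
    """Peak transfer rate in Mbyte/sec for a bus of the given width."""
    bytes_per_clock = bits_per_clock // 8
    return bytes_per_clock * clock_mhz


print(cache_bandwidth_mbytes(64, 40))   # 320 (instruction cache)
print(cache_bandwidth_mbytes(128, 40))  # 640 (data cache)
```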

The bus and cache control unit performs data and 
instruction accesses for the core unit. It receives cy- 
cle requests and specifications from the core unit, 
performs the data-cache or instruction-cache miss 
processing, controls TLB translation, and provides 
the interface to the external bus. Its pipelined struc- 
ture supports up to three outstanding bus cycles. 



2.0 PROGRAMMING INTERFACE 

The programmer-visible aspects of the architecture 
of the i860 XR microprocessor include data types, 
registers, instructions, and traps. 
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2.1 Data Types 

The i860 XR microprocessor provides operations for 
integer and floating-point data. Integer operations 
are performed on 32-bit operands with some support 
also for 64-bit operands. Load and store instructions 
can reference 8-bit, 16-bit, 32-bit, 64-bit, and 128-bit 
operands. Floating-point operations are performed 
on IEEE-standard 32- and 64-bit formats. Graphics 
oriented instructions operate on arrays of 8-, 16-, or 
32-bit pixels. 



2.1.1 INTEGER 

An integer is a 32-bit signed value in standard two's 
complement form. A 32-bit integer can represent a 
value in the range -2,147,483,648 (-2^31) to 
2,147,483,647 (+2^31 - 1). Arithmetic operations on 
8- and 16-bit integers can be performed by sign-extending 
the 8- or 16-bit values to 32 bits, then using 
the 32-bit operations. 

There are also add and subtract instructions that 
operate on 64-bit long integers. 

Load and store instructions may also reference (in 
addition to the 32- and 64-bit formats previously 
mentioned) 8- and 16-bit items in memory. When an 
8- or 16-bit item is loaded into a register, it is 
converted to an integer by sign-extending the value to 
32 bits. When an 8- or 16-bit item is stored from a 
register, the corresponding number of low-order bits 
of the register are used. 



2.1.2 ORDINAL 

Arithmetic operations are available for 32-bit ordinals. 
An ordinal is an unsigned integer. An ordinal 
can represent values in the range 0 to 
4,294,967,295 (2^32 - 1). 

Also, there are add and subtract instructions that 
operate on 64-bit ordinals. 



2.1.3 SINGLE- AND DOUBLE-PRECISION REAL 


Figure 2.1 shows the real number formats. A single- 
precision real (also called "single real") data type is 
a 32-bit binary floating-point number. Bit 31 is the 
sign bit; bits 30..23 are the exponent; and bits 22..0 
are the fraction. In accordance with ANSI/IEEE 
standard 754, the value of a single-precision real is 
defined as follows: 

1. If e = 0 and f ≠ 0, or e = 255, then generate a 
floating-point source-exception trap when 
encountered in a floating-point operation. 

2. If 0 < e < 255, then the value is (-1)^s x 1.f x 
2^(e-127). 

3. If e = 0 and f = 0, then the value is signed zero. 
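
The three single-precision rules can be checked against a raw 32-bit 
pattern. The following is a sketch only: the function name and the 
returned labels are illustrative, and the trap itself is represented 
by a label, not modeled. 

```python
def classify_single(bits):
    """Classify a 32-bit pattern using the three rules above."""
    s = (bits >> 31) & 0x1        # bit 31: sign
    e = (bits >> 23) & 0xFF       # bits 30..23: exponent
    f = bits & 0x7FFFFF           # bits 22..0: fraction
    if e == 0 and f == 0:
        return "signed zero", (-0.0 if s else 0.0)
    if e == 0 or e == 255:
        # Denormals (e = 0, f != 0) and infinities/NaNs (e = 255)
        return "source-exception trap", None
    # Normal number: (-1)^s x 1.f x 2^(e-127)
    value = (-1) ** s * (1 + f / 2**23) * 2.0 ** (e - 127)
    return "normal", value


print(classify_single(0x3FC00000))  # ('normal', 1.5)
print(classify_single(0x7F800000))  # ('source-exception trap', None)
```

The double-precision case has the same shape with an 11-bit exponent, 
a 52-bit fraction, and a bias of 1023. 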

A double-precision real (also called "double real") 
data type is a 64-bit binary floating-point number. Bit 
63 is the sign bit; bits 62..52 are the exponent; and 
bits 51. .0 are the fraction. In accordance with ANSI/ 
IEEE standard 754, the value of a double-precision 
real is defined as follows: 

1. If e = 0 and f ≠ 0, or e = 2047, then generate a 
floating-point source-exception trap when 
encountered in a floating-point operation. 

2. If 0 < e < 2047, then the value is (-1)^s x 1.f x 
2^(e-1023). 




[Diagram of the real formats: sign (s), EXPONENT, and FRACTION (f) fields] 
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Figure 2.1. Real Number Formats 
3. If e = 0 and f = 0, then the value is signed zero. 

The special values infinity, NaN ("Not a Number"), 
indefinite, and denormal generate a trap when en- 
countered. The trap handler implements IEEE-stan- 
dard results. 

A double real value occupies an even/odd pair of 
floating-point registers. Bits 31. .0 are stored in the 
even-numbered floating-point register; bits 63. .32 
are stored in the next higher odd-numbered floating- 
point register. 



2.1.4 PIXEL 

A pixel may be 8, 16, or 32 bits long depending on 
color and intensity resolution requirements. Regard- 
less of the pixel size, the i860 XR microprocessor 
always operates on 64 bits worth of pixels at a time. 
The pixel data type is used by two kinds of instruc- 
tions: 

• The selective pixel-store instruction that helps im- 
plement hidden surface elimination. 

• The pixel add instruction that helps implement 
3-D color intensity shading. 

To perform color intensity shading efficiently in a va- 
riety of applications, the i860 XR microprocessor de- 
fines three pixel formats according to Table 2.1. 

Figure 2.2 illustrates one way of assigning meaning 
to the fields of pixels. These assignments are for 
illustration purposes only. The i860 XR microproces- 
sor defines only the field sizes, not the specific use 
of each field. Other ways of using the fields of pixels 
are possible. 



Table 2.1. Pixel Formats 

Pixel Size   Bits of Color 1   Bits of Color 2   Bits of Color 3   Bits of Other 
(in bits)    Intensity         Intensity         Intensity         Attribute (Texture) 

8            N (≤ 8) bits of intensity*                            8 - N 
16           6                 6                 4 
32           8                 8                 8                 8 

The intensity attribute fields may be assigned to colors in 
any order convenient to the application. 

*With 8-bit pixels, up to 8 bits can be used for intensity; the 
remaining bits can be used for any other attribute, such as 
color. The intensity bits must be the low-order bits of the 
pixel. 



2.2 Register Set 

As Figure 2.3 shows, the i860 XR microprocessor 
has the following registers: 

• An integer register file 

• A floating-point register file 

• Six control registers (psr, epsr, db, dirbase, fir, 
and fsr) 

• Four special-purpose registers (KR, KI, T, and 
MERGE) 

The control registers are accessible only by load 
and store control-register instructions; the integer 
and floating-point registers are accessed by arithmetic 
operations and load and store instructions. The 
special-purpose registers KR, KI, T, and MERGE are 
used by a few specific instructions. 
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Figure 2.2. Pixel Format Example 
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2.2.1 INTEGER REGISTER FILE 

There are 32 integer registers, each 32 bits wide, 
referred to as r0 through r31, which are used for 
address computation and scalar integer computations. 
Register r0 always returns zero when read, 
independently of what is stored in it. 

2.2.2 FLOATING-POINT REGISTER FILE 

There are 32 floating-point registers, each 32 bits 
wide, referred to as f0 through f31, which are used 
for floating-point computations. Registers f0 and f1 
always return zero when read, independently of 
what is stored in them. The floating-point registers 
are also used by a set of graphics operations, 
primarily for 3-D graphics computations. 

When accessing 64-bit floating-point or integer values, 
the i860 XR microprocessor uses an even/odd 
pair of registers. When accessing 128-bit values, it 
uses an aligned set of four registers (f0, f4, f8, ..., 
f28). The instruction must designate the lowest 
register number of the set of registers containing 64- or 
128-bit values. Misaligned register numbers produce 
undefined results. The register with the lowest number 
contains the least significant part of the value. 
For 128-bit values, the register pair with the lower 
numbers contains the least significant 64 bits while 
the register pair with the higher numbers contains the 
most significant 64 bits. 
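
The alignment rule can be expressed as a small check. This is a 
sketch for an assembler or simulator author; the function name is 
hypothetical, and raising an error is a choice made here for 
illustration (the hardware produces undefined results instead). 

```python
def fp_registers_for(first_reg, width_bits):
    """Return the floating-point registers holding one value,
    least significant 32 bits first."""
    if width_bits not in (32, 64, 128):
        raise ValueError("width must be 32, 64, or 128 bits")
    count = width_bits // 32          # 1, 2, or 4 registers
    if not 0 <= first_reg <= 31 or first_reg % count != 0:
        raise ValueError(f"f{first_reg} is misaligned for {width_bits} bits")
    return [f"f{first_reg + i}" for i in range(count)]


print(fp_registers_for(2, 64))    # ['f2', 'f3']
print(fp_registers_for(28, 128))  # ['f28', 'f29', 'f30', 'f31']
```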



The 128-bit load and store instructions, along with 
the 128-bit data path between the floating-point 
registers and the data cache, help to sustain the 
extraordinarily high rate of computation. 

2.2.3 PROCESSOR STATUS REGISTER 

The processor status register (psr) contains miscel- 
laneous state information for the current process. 
Figure 2.4 shows the format of the psr. 

• BR (Break Read) and BW (Break Write) enable a 
data access trap when the operand address 
matches the address in the db register and a 
read or write (respectively) occurs. 

• Various instructions set CC (Condition Code) ac- 
cording to tests they perform. The branch-on- 
condition-code instructions test its value. The bla 
instruction sets and tests LCC (Loop Condition 
Code). 

• IM (Interrupt Mode) enables external interrupts if 
set; disables interrupts if clear. 

• U (User Mode) is set when the i860 XR microprocessor 
is executing in user mode; it is clear 
when the i860 XR microprocessor is executing in 
supervisor mode. In user mode, writes to some 
control registers are inhibited. This bit also controls 
the memory protection mechanism. See 
section 2.4.4.3 for a description of memory protection 
in user and supervisor modes. 
[Register layout diagram; legible field labels: BREAK READ, BREAK WRITE, CONDITION CODE, SC] 
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Figure 2.4 Processor Status Register 
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Figure 2.5 Extended Processor Status Register 



• PIM (Previous Interrupt Mode) and PU (Previous 
User Mode) save the corresponding status bits 
(IM and U) on a trap, because those status bits 
are changed when a trap occurs. They are restored 
into their corresponding status bits when 
returning from a trap handler with a branch indirect 
instruction when a trap flag is set in the psr. 

• FT (Floating-Point Trap), DAT (Data Access 
Trap), IAT (Instruction Access Trap), IN (Interrupt), 
and IT (Instruction Trap) are trap flags. 
They are set when the corresponding trap condition 
occurs. The trap handler examines these bits 
to determine which condition or conditions have 
caused the trap. 



• DS (Delayed Switch) is set if a trap occurs during 
the instruction before dual-instruction mode is entered 
or exited. If DS is set and DIM (Dual Instruction 
Mode) is clear, the i860 XR microprocessor 
switches to dual-instruction mode one instruction 
after returning from the trap handler. If DS and 
DIM are both set, the i860 XR microprocessor 
switches to single-instruction mode one instruction 
after returning from the trap handler. 

• When a trap occurs, the i860 XR microprocessor 
sets DIM if it is executing in dual-instruction 
mode; it clears DIM if it is executing in single-instruction 
mode. If DIM is set after returning from a 
trap handler, the i860 XR microprocessor resumes 
execution in dual-instruction mode. 
• When KNF (Kill Next Floating-Point Instruction) is 
set, the next floating-point instruction is suppressed 
(except that its dual-instruction mode bit 
is interpreted). A trap handler sets KNF if the 
trapped floating-point instruction should not be 
reexecuted. 

• SC (Shift Count) stores the shift count used by 
the last right-shift instruction. It controls the number 
of shifts executed by the double-shift instruction. 

• PS (Pixel Size) and PM (Pixel Mask) are used by 
the pixel-store instruction and by the graphics 
instructions. The values of PS control pixel size as 
defined by Table 2.2. The bits in PM correspond 
to pixels to be updated by the pixel-store instruction 
pst.d. The low-order bit of PM corresponds 
to the low-order pixel of the 64-bit source operand 
of pst.d. The number of low-order bits of PM 
that are actually used is the number of pixels that 
fit into 64 bits, which depends upon PS. If a bit of 
PM is set, then pst.d stores the corresponding 
pixel. Refer also to the pst.d instruction in section 



Table 2.2. Values of PS 

Value   Pixel Size    Pixel Size 
        in bits       in bytes 

00      8             1 
01      16            2 
10      32            4 
11      (undefined)   (undefined) 
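
The interaction of PS and PM can be sketched as follows. The function 
and table names are illustrative, and only the selection of pixels is 
modeled, not the store itself. 

```python
PIXEL_BITS = {0b00: 8, 0b01: 16, 0b10: 32}   # PS value -> pixel size in bits


def pixels_stored(ps, pm):
    """Return the indices (0 = low-order pixel) of the pixels that
    pst.d would store for the given PS and PM field values."""
    if ps not in PIXEL_BITS:
        raise ValueError("PS = 11 is undefined")
    pixels_per_word = 64 // PIXEL_BITS[ps]    # 8, 4, or 2 pixels per 64 bits
    # Only the low-order pixels_per_word bits of PM are consulted.
    return [i for i in range(pixels_per_word) if (pm >> i) & 1]


print(pixels_stored(0b00, 0b10100101))  # [0, 2, 5, 7]
print(pixels_stored(0b10, 0b10100101))  # [0]
```

With 32-bit pixels only two pixels fit in 64 bits, so only the two 
low-order bits of PM take part. 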



2.2.4 EXTENDED PROCESSOR STATUS 
REGISTER 

The extended processor status register (epsr) con- 
tains additional state information for the current pro- 
cess beyond that stored in the psr. Figure 2.5 shows 
the format of the epsr. 

• The processor type is one for the i860 XR microprocessor. 

• The stepping number has a unique value that distinguishes 
among different revisions of the processor. 

• IL (Interlock) is set if a trap occurs after a lock 
instruction but before the load or store following 
the subsequent unlock instruction. IL indicates to 
the trap handler that a locked sequence has 
been interrupted. When the trap handler finds IL 
set, it should scan backwards for the lock in- 
struction and restart at that point. The absence of 
a lock instruction within 30-33 instructions of the 
trap indicates a programming error. 

• WP (write protect) controls the semantics of the 
W bit of page table entries. A clear W bit in either 
the directory or the page table entry causes 
writes to be trapped. When WP is clear, writes 
are trapped in user mode, but not in supervisor 
mode. When WP is set, writes are trapped in both 
user and supervisor modes. After the value of the 
WP bit is changed, the TLB must be invalidated 
by setting the ITI bit of the dirbase register, before 
any stores are performed. 

• INT (Interrupt) is the value of the INT input pin. 

• DCS (Data Cache Size) is a read-only field that 
tells the size of the on-chip data cache. The number 
of bytes actually available is 2^(12 + DCS); therefore, 
a value of zero indicates 4 Kbytes, one indicates 
8 Kbytes, etc. 
• PBM (Page-Table Bit Mode) determines which bit 
of page-table entries is output on the PTB pin. 
When PBM is clear, the PTB signal reflects bit CD 
of the page-table entry used for the current cycle. 
When PBM is set, the PTB signal reflects bit WT 
of the page-table entry used for the current cycle. 

• BE (Big Endian) controls the ordering of bytes 
within a data item in memory. Normally (i.e. when 
BE is clear) the i860 XR microprocessor operates 
in little endian mode, in which the addressed byte 
is the low-order byte. When BE is set (big endian 
mode), the low-order three bits of all load and 
store addresses are complemented, then 
masked to the appropriate boundary for align- 
ment. This causes the addressed byte to be the 
most significant byte. Section 2.3 discusses little 
and big endian addressing. 
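
The address transformation applied when BE is set can be sketched as 
below. The function name is illustrative, and the sketch covers only 
the complement-and-mask step as worded above for aligned accesses of 
1, 2, 4, or 8 bytes; how the bus presents the access is not modeled. 

```python
def effective_address(addr, size_bytes, big_endian):
    """Apply the BE address transformation to an aligned load or
    store of size_bytes (1, 2, 4, or 8)."""
    if big_endian:
        addr ^= 0b111                 # complement the low-order three bits
        addr &= ~(size_bytes - 1)     # mask to the operand's alignment
    return addr


print(hex(effective_address(0x1000, 1, True)))   # 0x1007
print(hex(effective_address(0x1000, 4, True)))   # 0x1004
print(hex(effective_address(0x1000, 8, True)))   # 0x1000
```

A byte access to the lowest address of a 64-bit word is thereby 
redirected to the position of that word's most significant byte. 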

• OF (Overflow Flag) is set by adds, addu, subs, 
and subu when integer overflow occurs. For 
adds and subs, OF is set if the carry from bit 31 
is different from the carry from bit 30. For addu, 
OF is set if there is a carry from bit 31. For subu, 
OF is set if there is no carry from bit 31. Under all 
other conditions, it is cleared by these instructions. 
OF controls the function of the intovr 
instruction. OF cannot be written in user mode 
using st.c. 
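
The carry rules for OF can be modeled in a few lines. This sketch 
assumes that subtraction a - b is carried out as a + ~b + 1, which is 
the usual two's complement formulation and is an assumption here, not 
a statement from this data sheet. The function names are illustrative. 

```python
MASK32 = 0xFFFFFFFF


def _carries(a, b, carry_in):
    """Return (carry from bit 30, carry from bit 31) of a + b + carry_in."""
    c30 = ((a & 0x7FFFFFFF) + (b & 0x7FFFFFFF) + carry_in) >> 31
    c31 = (a + b + carry_in) >> 32
    return c30, c31


def overflow_flag(op, a, b):
    """Model OF for op in {'adds', 'addu', 'subs', 'subu'}."""
    a &= MASK32
    b &= MASK32
    if op in ("subs", "subu"):
        b, carry_in = ~b & MASK32, 1     # assumption: a - b == a + ~b + 1
    else:
        carry_in = 0
    c30, c31 = _carries(a, b, carry_in)
    if op in ("adds", "subs"):
        return c31 != c30                # signed overflow
    if op == "addu":
        return c31 == 1                  # carry out of bit 31
    if op == "subu":
        return c31 == 0                  # no carry means a borrow occurred
    raise ValueError(op)


print(overflow_flag("adds", 0x7FFFFFFF, 1))  # True
print(overflow_flag("subu", 5, 3))           # False
```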

2.2.5 DATA BREAKPOINT REGISTER 

The data breakpoint register (db) is used to gener- 
ate a trap when the i860 XR microprocessor makes 
a data-operand access to the address stored in this 
register. The trap is enabled by BR and BW in psr. 
The db register can only be changed from supervi- 
sor level. When comparing, a number of low order 
bits of the address are ignored, depending on the 
size of the operand. For example, a 16-bit access 
ignores the low-order bit of the address when com- 
paring to db; a 32-bit access ignores the low-order 
two bits. This ensures that any access that overlaps 
the address contained in the register will generate a 
trap. The DAT occurs before the data is accessed 
and prevents the load or store from completing. 
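
The address comparison can be sketched as follows. The function name 
is illustrative; the BR and BW enable bits and the trap itself are not 
modeled. 

```python
def db_match(access_addr, size_bytes, db):
    """True if an aligned access of size_bytes (a power of two)
    matches db once the low-order log2(size_bytes) bits are ignored."""
    mask = ~(size_bytes - 1)
    return (access_addr & mask) == (db & mask)


db = 0x1003
print(db_match(0x1000, 4, db))  # True: the 32-bit access overlaps 0x1003
print(db_match(0x1000, 1, db))  # False: the byte at 0x1000 does not
```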

2.2.6 DIRECTORY BASE REGISTER 

The directory base register dirbase (shown in Figure 
2.6) controls address translation, caching, and bus 
options. The dirbase register can only be changed 
from supervisor level. The BL bit is changed from 
user level with the lock and unlock instructions. 

• ATE (Address Translation Enable), when set, enables 
the virtual-address translation algorithm. 
The data cache must be flushed before changing 
the ATE bit. 

• DPS (DRAM Page Size) controls how many bits 
to ignore when comparing the current bus-cycle 
address with the previous bus-cycle address to 
generate the NENE# signal. This feature allows 
for higher speeds when using static column or 
page-mode DRAMs and consecutive reads and 
writes access the same row. The comparison ignores 
the low-order 12 + DPS bits. A value of zero is 
appropriate for one bank of 256K x n RAMs, 1 
for 1M x n RAMs, etc. For interleaved memory, 
increase DPS by one for each power of interleaving: 
add one for 2-way, two for 4-way, etc. 
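
The comparison that DPS controls can be sketched as below. The 
function name is illustrative; the sketch shows only the address 
comparison, not the timing or polarity of the NENE# pin. 

```python
def same_dram_page(prev_addr, cur_addr, dps):
    """Compare two bus-cycle addresses while ignoring the
    low-order 12 + DPS bits."""
    shift = 12 + dps
    return (prev_addr >> shift) == (cur_addr >> shift)


print(same_dram_page(0x12345000, 0x12345FFC, 0))  # True
print(same_dram_page(0x12345000, 0x12346000, 0))  # False
print(same_dram_page(0x12346000, 0x12347FFF, 1))  # True
```

Raising DPS by one doubles the address range treated as a single page. 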

• When BL (Bus Lock) is set, external bus accesses 
are locked. The LOCK# signal is asserted for the 
next bus cycle whose internal bus request is generated 
after BL is set. It remains asserted on every 
subsequent bus cycle as long as BL remains set. 
The LOCK# signal is deasserted on the next 
load or store instruction after BL is cleared. Traps 
immediately clear BL. The lock and unlock 
instructions control the BL bit. The result of modifying 
BL with the st.c instruction is not defined. 

• ITI (I-Cache, TLB Invalidate), when set in the value 
that is loaded into dirbase, causes all entries 
in the instruction cache and address-translation 
cache (TLB) to be invalidated. The ITI bit does 
not remain set in dirbase. ITI always appears as 
zero when reading dirbase. Section 2.5 discusses 
flushing the data cache before invalidating the 
TLB. 

• When CS8 (Code Size 8-Bit) is set, instruction 
cache misses are processed as 8-bit bus cycles. 
When this bit is clear, instruction cache misses 
are processed as 64-bit bus cycles. This bit cannot 
be set by software; hardware sets this bit at 
initialization time. It can be cleared by software 
(one time only) to allow the system to execute out 
of 64-bit memory after bootstrapping from 8-bit 
EPROM. A nondelayed branch to code in 64-bit 
memory should directly follow the st.c (store control 
register) instruction that clears CS8, in order 
to make the transition from 8-bit to 64-bit memory 
occur at the correct time. The branch instruction 
must be aligned on a 64-bit boundary. 

• RB (Replacement Block) identifies the cache 
block to be replaced by cache replacement algorithms. 
The high-order bit of RB is ignored by the 
instruction and data caches. RB conditions the 
cache flush instruction flush, which is discussed 
in Section 8. Table 2.3 explains the values of RB. 

• RC (Replacement Control) controls cache replacement 
algorithms. Table 2.4 explains the significance 
of the values of RC. 

• DTB (Directory Table Base) contains the high-order 
20 bits of the physical address of the page 
directory when address translation is enabled (i.e. 
ATE = 1). The low-order 12 bits of the address 
are zeros. 
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Figure 2.7. Floating-Point Status Register 





Table 2.3. Values of RB 

Value   Replace       Replace Instruction 
        TLB Block     and Data Cache Block 

00      0             0 
01      1             1 
10      2             0 
11      3             1 


Table 2.4. Values of RC 

Value   Meaning 

00      Selects the normal replacement algorithm where any 
        block in the set may be replaced on cache misses in 
        all caches. 

01      Instruction, data, and TLB cache misses replace the 
        block selected by RB. The instruction and data caches 
        ignore the high-order bit of RB. This mode is used for 
        instruction cache and TLB testing. 

10      Data cache misses replace the block selected by the 
        low-order bit of RB. Instruction and TLB caches use 
        random replacement. 

11      Disables data cache replacement. Instruction and TLB 
        caches use random replacement. 



2.2.7 FAULT INSTRUCTION REGISTER 

When a trap occurs, this register contains the address 
of the trapping instruction (not necessarily the 
instruction that created the conditions that required 
the trap). The fir is a read-only register. In 
single-instruction mode, using a ld.c instruction to read the 
fir anytime except the first time after a trap saves in 
idest the address of the ld.c instruction; in 
dual-instruction mode, the address of its floating-point 
companion (address of the ld.c - 4) is saved. 

2.2.8 FLOATING-POINT STATUS REGISTER 

The floating-point status register (fsr) contains the 
floating-point trap and rounding-mode status for the 
current process. Figure 2.7 shows its format. The fsr 
is writable in user level. 

• If FZ (Flush Zero) is clear and underflow occurs, 
a result-exception trap is generated. When FZ is 
set and underflow occurs, the result is set to zero, 
and no trap due to underflow occurs. 

• If TI (Trap Inexact) is clear, inexact results do not 
cause a trap. If TI is set, inexact results cause a 
trap. The sticky inexact flag (SI) is set whenever 
an inexact result is produced, regardless of the 
setting of TI. 

• RM (Rounding Mode) specifies one of the four 
rounding modes defined by the IEEE standard. 
Given a true result b that cannot be represented 
by the target data type, the i860 XR microprocessor 
determines the two representable numbers a 
and c that most closely bracket b in value (a < b 
< c). The i860 XR microprocessor then rounds 
(changes) b to a or c according to the mode selected 
by RM as defined in Table 2.5. Rounding 
introduces an error in the result that is less than 
one least-significant bit. 

Table 2.5. Values of RM 

Value   Rounding Mode              Rounding Action 

00      Round to nearest or even   Closer to b of a or c; if equally 
                                   close, select even number (the one 
                                   whose least significant bit is zero). 
01      Round down (toward -∞)     a 
10      Round up (toward +∞)       c 
11      Chop (toward zero)         Smaller in magnitude of a or c. 
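
The four rounding actions can be illustrated with a toy model in which 
the integers stand in for the representable values, so that a and c 
are the integers bracketing b. This is an illustration of the 
selection rule only, not of the processor's floating-point datapath; 
the function name is hypothetical. 

```python
import math
from fractions import Fraction


def round_to_integer(b, rm):
    """Round b under the two-bit RM value, using the integers as the
    set of representable numbers."""
    a, c = math.floor(b), math.ceil(b)
    if a == c:                 # b is already representable
        return a
    if rm == 0b00:             # round to nearest or even
        if b - a != c - b:
            return a if b - a < c - b else c
        return a if a % 2 == 0 else c
    if rm == 0b01:             # round down (toward -infinity)
        return a
    if rm == 0b10:             # round up (toward +infinity)
        return c
    if rm == 0b11:             # chop (toward zero)
        return a if abs(a) < abs(c) else c
    raise ValueError("RM is a two-bit field")


b = Fraction(-5, 2)
print([round_to_integer(b, rm) for rm in range(4)])  # [-2, -3, -2, -2]
```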

• The U-bit (Update Bit), if set in the value that is 
loaded into fsr by a st.c instruction, enables updating 
of the result-status bits (AE, AA, AI, AO, 
AU, MA, MI, MO, and MU) in the first stage of the 
floating-point adder and multiplier pipelines. If this 
bit is clear, the result-status bits are unaffected 
by a st.c instruction; st.c ignores the corresponding 
bits in the value that is being loaded. A st.c 
always updates fsr bits 21..17 and 8..0 directly. 
The U-bit does not remain set; it always appears 
as zero when read. 

• The FTE (Floating-Point Trap Enable) bit, if clear, 
disables all floating-point traps (invalid input operand, 
overflow, underflow, and inexact result). 

• SI (Sticky Inexact) is set when the last stage result 
of either the multiplier or adder is inexact (i.e. 
when either AI or MI is set). SI is "sticky" in the 
sense that it remains set until reset by software. 
AI and MI, on the other hand, can be changed by 
the subsequent floating-point instruction. 

• SE (Source Exception) is set when one of the 
source operands of a floating-point operation is 
invalid; it is cleared when all the input operands 
are valid. Invalid input operands include denormals, 
infinities, and all NaNs (both quiet and signaling). 

• When read from the fsr, the result-status bits MA, 
MI, MO, and MU (Multiplier Add-One, Inexact, 
Overflow, and Underflow, respectively) describe 
the last stage result of the multiplier. 

• When read from the fsr, the result-status bits AA, 
AI, AO, AU, and AE (Adder Add-One, Inexact, 
Overflow, Underflow, and Exponent, respectively) 
describe the last stage result of the adder. The 
high-order three bits of the 11-bit exponent of the 
adder result are stored in the AE field. 

• The Adder Add One and Multiplier Add One bits 
indicate that the absolute value of the result fraction 
grew by one least-significant bit due to 
rounding. AA and MA are not influenced by the 
sign of the result. 

• After a floating-point operation in a given unit (adder 
or multiplier), the result-status bits of that unit 
are undefined until the point at which result exceptions 
are reported. 

• When written to the fsr with the U-bit set, the 
result-status bits are placed into the first stage of 
the adder and multiplier pipelines. When the 
processor executes pipelined operations, it propagates 
the result-status bits of a particular unit 
(multiplier or adder) one stage for each pipelined 
floating-point operation for that unit. When they 
reach the last stage, they replace the normal 
result-status bits in the fsr. When the U-bit is not 
set, result-status bits in the word being written to 
the fsr are ignored. 

• In a floating-point dual-operation instruction (e.g. 
add-and-multiply or subtract-and-multiply), both 
the multiplier and the adder may set exception 
bits. The result-status bits for a particular unit 
remain set until the next operation that uses that 
unit. 

• RR (Result Register) specifies which floating-point 
register (f0-f31) was the destination register 
when a result-exception trap occurs due to a 
scalar operation. 

• LRP (Load Pipe Result Precision), IRP (Integer 
(Graphics) Pipe Result Precision), MRP (Multiplier 
Pipe Result Precision), and ARP (Adder Pipe Result 
Precision) aid in restoring pipeline state after 
a trap or process switch. Each defines the precision 
of the last stage result in the corresponding 
pipeline. One of these bits is set when the result 
in the last stage of the corresponding pipeline is 
double precision; it is cleared if the result is single 
precision. These bits cannot be changed by software. 



2.2.9 KR, KI, T, AND MERGE REGISTERS 

The KR, KI, and T registers are special-purpose registers 
used by the dual-operation floating-point 
instructions pfam, pfmam, pfsm, and pfmsm, 
which initiate both an adder (A-unit) operation and a 
multiplier (M-unit) operation. The KR, KI, and T registers 
can store values from one dual-operation instruction 
and supply them as inputs to subsequent 
dual-operation instructions. (Refer to Figure 2.14.) 

The MERGE register is used only by the graphics 
instructions. The purpose of the MERGE register is 
to accumulate (or merge) the results of multiple-ad- 
dition operations that use as operands the color-in- 
tensity values from pixels or distance values from a 
Z-buffer. The accumulated results can then be 
stored in one 64-bit operation. 

Two multiple-addition instructions and an OR in- 
struction use the MERGE register. The addition in- 
structions are designed to add interpolation values 
to each color-intensity field in an array of pixels or to 
each distance value in a Z-buffer. 

Refer to the instruction descriptions in section 8 for 
more information about these registers. 



2.3 Addressing 

Memory is addressed in byte units with a paged
virtual-address space of 2^32 bytes. Data and
instructions can be located anywhere in this address
space. Address arithmetic is performed using 32-bit
input values and produces 32-bit results. The
low-order 32 bits of the result are used in case of overflow.

Normally, multibyte data values are stored in memo- 
ry in little endian format, i.e., with the least significant 
byte at the lowest memory address. As an option, 
the ordering can be dynamically selected by soft- 
ware in supervisor mode. The i860 XR microproces- 
sor also offers big endian mode, in which the most 
significant byte of a data item is at the lowest ad- 
dress. Figure 2.8 shows the difference between the 
two storage modes. Big endian and little endian data 
areas should not be mixed within a 64-bit data word. 
Illustrations of data structures in this data sheet 
show data stored in little endian mode, i.e., the low- 
order byte is at the lowest memory address. 



Code accesses are always done with little endian 
addressing. This implies that code will appear differ- 
ently than documented here when accessed as big 
endian data. Intel recommends that disassemblers
running in a big endian system convert instructions
that have been read as data back to little endian
form and present them in the format documented
here.

Page directories and page tables are also accessed 
in little endian mode, regardless of the value of the 
BE bit. 

Alignment requirements are as follows (any violation 
results in a data-access trap): 

• 128-bit values are aligned on 16-byte boundaries
when referenced in memory (i.e. the four least
significant address bits must be zero).

• 64-bit values are aligned on 8-byte boundaries
when referenced in memory (i.e. the three least
significant address bits must be zero).

• 32-bit values are aligned on 4-byte boundaries
when referenced in memory (i.e. the two least
significant address bits must be zero).

• 16-bit values are aligned on 2-byte boundaries
when referenced in memory (i.e. the least significant
address bit must be zero).
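
The alignment rules above reduce to a single mask test on the low-order address bits. The following is a minimal sketch in Python, not processor code; the function name is_aligned is hypothetical, and the 8-bit case is included only as the assumption that single bytes have no alignment requirement.

```python
def is_aligned(address, size_bits):
    """Return True if `address` meets the alignment rule for an operand
    of `size_bits` bits (a violation would cause a data-access trap)."""
    # 8-bit operands are an assumption added for completeness; the list
    # above only covers 16-, 32-, 64-, and 128-bit values.
    if size_bits not in (8, 16, 32, 64, 128):
        raise ValueError("unsupported operand size")
    size_bytes = size_bits // 8
    # The low log2(size_bytes) address bits must all be zero.
    return (address & (size_bytes - 1)) == 0
```

For example, is_aligned(0x1008, 64) is true because the three least significant address bits are zero, while is_aligned(0x1008, 128) is false because bit 3 is set.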

2.4 Virtual Addressing 



When address translation is enabled, the i860 XR 
microprocessor maps instruction and data virtual ad- 
dresses into physical addresses before referencing 
memory. This address transformation is compatible
with that of the Intel386(TM) microprocessor and
implements the basic features needed for page-oriented
virtual-memory systems and page-level protection.

The address translation is optional. Address transla- 
tion is in effect only when the ATE bit of dirbase is 
set. This bit is typically set by the operating system 
during software initialization. The ATE bit must be 
set if the operating system is to implement page-ori- 
ented protection or page-oriented virtual memory. 

Figure 2.9. Format of a Virtual Address



Address translation is disabled when the processor 
is reset. It is enabled when a store to dirbase sets 
the ATE bit. It is disabled again when a store clears 
the ATE bit. 



2.4.1 PAGE FRAME 

A page frame is a 4-Kbyte unit of contiguous ad- 
dresses of physical main memory. Page frames be- 
gin on 4-Kbyte boundaries and are fixed in size. A 
page is the collection of data that occupies a page 
frame when that data is present in main memory. 
The data may also occupy some location in secondary
storage when there is not sufficient space in
main memory.



2.4.2 VIRTUAL ADDRESS 

A virtual address refers indirectly to a physical
address by specifying a page table, a page within that
table, and an offset within that page. Figure 2.9
shows the format of a virtual address.

Figure 2.10 shows how the i860 XR microprocessor 
converts the DIR, PAGE, and OFFSET fields of a 
virtual address into the physical address by consult- 
ing two levels of page tables. The addressing mech- 
anism uses the DIR field as an index into a page 
directory, uses the PAGE field as an index into the 
page table determined by the page directory, and 
uses the OFFSET field to address a byte within the 
page determined by the page table. 
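
The field extraction described above can be sketched as follows. This is an illustration only: the field widths (10-bit DIR, 10-bit PAGE, 12-bit OFFSET) are inferred from the 1K-entry tables and 4-Kbyte pages described in this section, since Figure 2.9 is not reproduced here, and the function name split_virtual_address is hypothetical.

```python
def split_virtual_address(va):
    """Split a 32-bit virtual address into (DIR, PAGE, OFFSET).

    Assumed layout: DIR = bits 31..22, PAGE = bits 21..12,
    OFFSET = bits 11..0.
    """
    va &= 0xFFFFFFFF                   # keep only the low-order 32 bits
    dir_index = (va >> 22) & 0x3FF     # index into the page directory
    page_index = (va >> 12) & 0x3FF    # index into the selected page table
    offset = va & 0xFFF                # byte offset within the 4-Kbyte page
    return dir_index, page_index, offset
```

Under these assumptions, the address 0x00403025 selects directory entry 1, page-table entry 3, and byte offset 0x025.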



2.4.3 PAGE TABLES 

A page table is simply an array of 32-bit page specifiers.
A page table is itself a page, and therefore
contains 4 Kbytes of memory or at most 1K 32-bit
entries.

Figure 2.10. Address Translation

Two levels of tables are used to address a page of
memory. At the higher level is a page directory. The
page directory addresses up to 1K page tables of
the second level. A page table of the second level
addresses up to 1K pages. All the tables addressed
by one page directory, therefore, can address 1M
pages (2^20). Because each page contains 4 Kbytes
(2^12 bytes), the tables of one page directory can
span the entire physical address space of the i860
XR microprocessor (2^20 x 2^12 = 2^32).

The physical address of the current page directory is
stored in the DTB field of the dirbase register. Memory
management software has the option of using one 
page directory for all processes, one page directory 
for each process, or some combination of the two. 

2.4.4 PAGE-TABLE ENTRIES 

Page-table entries (PTEs) in either level of page ta- 
bles have the same format. Figure 2.11 illustrates 
this format. 



2.4.4.1 Page Frame Address 

The page frame address specifies the physical start- 
ing address of a page. Because pages are located 
on 4K boundaries, the low-order 12 bits are always 
zero. In a page directory, the page frame address is 
the address of a page table. In a second-level page 
table, the page frame address is the address of the 
page frame that contains the desired memory oper- 
and. 



2.4.4.2 Present Bit 

The P (present) bit indicates whether a page table
entry can be used in address translation. P = 1
indicates that the entry can be used. When P = 0 in
either level of page tables, the entry is not valid for
address translation, and the rest of the entry is
available for software use; none of the other bits in the
entry is tested by the hardware. If P = 0 in either
level of page tables when an attempt is made to use
a page-table entry for address translation, the
processor signals either a data-access fault or an
instruction-access fault. In software systems that
support paged virtual memory, the trap handler can
bring the required page into physical memory.

Note that there is no P bit for the page directory 
itself. The page directory may be not-present while 
the associated process is suspended, but the oper- 
ating system must ensure that the page directory 
indicated by the dirbase image associated with the
process is present in physical memory before the 
process is dispatched. 

2.4.4.3 Writable and User Bits 

The W (writable) and U (user) bits are used for page- 
level protection, which the i860 XR microprocessor 
performs at the same time as address translation. 
The concept of privilege for pages is implemented 
by assigning each page to one of two levels: 

1. Supervisor level (U = 0): for the operating system
and other systems software and related data.

2. User level (U = 1): for applications procedures
and data.

Figure 2.11. Format of a Page Table Entry

The U bit of the psr indicates whether the i860 XR
microprocessor is executing at user or supervisor
level. The i860 XR microprocessor maintains the U
bit of psr as follows:

• The i860 XR microprocessor clears the psr U bit 
to indicate supervisor level when a trap occurs 
(including when the trap instruction causes the 
trap). The prior value of U is copied into PU. 

• The i860 XR microprocessor copies the psr PU 
bit into the U bit when an indirect branch is exe- 
cuted and one of the trap bits is set. If PU was 
one, the i860 XR microprocessor enters user 
level. 

With the U bit of psr and the W and U bits of the 
page table entries, the i860 XR microprocessor im- 
plements the following protection rules: 

• When at user level, a read or write of a supervi- 
sor-level page causes a trap. 

• When at user level, a write to a page whose W bit 
is clear causes a trap. 

• When at user level, st.c to certain control registers
is ignored.

When the i860 XR microprocessor is executing at 
supervisor level, all pages are addressable, but, 
when it is executing at user level, only pages that 
belong to the user-level are addressable. 

When the i860 XR microprocessor is executing at 
supervisor level, all pages are readable. Whether a 
page is writable depends upon the write-protection 
mode controlled by WP of epsr: 

WP = 0    All pages are writable.

WP = 1    A write to a page whose W bit is
          clear causes a trap.



When the i860 XR microprocessor is executing at 
user level, only pages that belong to user level and 
are marked writable are actually writable; pages that 
belong to supervisor level are neither readable nor 
writable from user level. 

2.4.4.4 Write-Through Bit 

The i860 XR microprocessor does not implement a 
write-through caching policy for the on-chip data 
cache; however, the WT (write-through) bit in the 
second-level page-table entry does determine inter- 
nal caching policy. If WT is set in a PTE, on-chip 
caching of data from the corresponding page is in- 
hibited. The i860 XR CPU may place pages having 
WT = 1 into the instruction cache. Future imple- 
mentations of the i860 XR architecture may adhere 
to a write-through data caching policy. Therefore, 
they may cache pages having the WT bit of the PTE 
set. If WT is clear, the normal write-back policy is 
applied to data from the page in the on-chip caches. 
The WT bit of page directory entries is not refer- 
enced by the processor, but is reserved. 

The WT bit is independent of the CD bit; therefore, 
data may be placed in a second-level coherent 
cache, but kept out of the on-chip caches. 



2.4.4.5 Cache Disable Bit 

If the CD (cache disable) bit in the second-level 
page-table entry is set, data from the associated
page is not placed in instruction or data caches. 
Clearing CD permits the cache hardware to place 
data from the associated page into caches. The CD 
bit of page directory entries is not referenced by the 
processor, but is reserved. 

To control external caches, the i860 XR microproc- 
essor outputs on its PTB pin either the CD or WT bit. 
The PBM bit of epsr determines which bit is output. 

2.4.4.6 Accessed and Dirty Bits 

The A (accessed) and D (dirty) bits provide data 
about page usage in both levels of the page tables. 

The i860 XR microprocessor sets the corresponding 
accessed bits in both levels of page tables before a 
read or write operation to a page. The processor 
tests the dirty bit in the second-level page table be- 
fore a write to an address covered by that page table 
entry, and, under certain conditions, causes traps. 
The trap handler then has the opportunity to main- 
tain appropriate values in the dirty bits. The dirty bit 
in directory entries is not tested by the i860 XR mi- 
croprocessor. The precise algorithm for using these 
bits is specified in Section 2.4.5. 

An operating system that supports paged virtual 
memory can use these bits to determine what pages 
to eliminate from physical memory when the de- 
mand for memory exceeds the physical memory 
available. The D and A bits in the PTE (page-table 
entry) are normally initialized to zero by the operat- 
ing system. The processor sets the A bit when a 
page is accessed either by a read or write operation. 
When a data- or instruction-access fault occurs, the 
trap handler sets the D bit if an allowable write is 
being performed, then re-executes the instruction. 

The operating system is responsible for coordinating 
its updates to the accessed and dirty bits with up- 
dates by the CPU and by other processors that may 
share the page tables. The i860 XR microprocessor 
automatically asserts the LOCK# signal while set- 
ting the A bit. If an A-bit of a PTE is found not set 
during a locked sequence (created by the lock in- 
struction), a trap will occur and the processor will not 
update the A-bit. 

2.4.4.7 Combining Protection of Both Levels of 
Page Tables 

For any one page, the protection attributes of its 
page directory entry may differ from those of its 
page table entry. The i860 XR microprocessor computes
the effective protection attributes for a page
by examining the protection attributes in both the
directory and the page table. Table 2.6 shows the
effective protection provided by the possible
combinations of protection attributes.
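
The combinations in Table 2.6 appear to follow a simple rule: a page is user-accessible only if the U bit is set at both levels, and it is write-enabled only if the W bit is set at both levels. The sketch below encodes that reading of the table. It is an interpretation, not a statement of how the hardware is built, and the function name combined_protection and its parameters are hypothetical.

```python
def combined_protection(dir_u, dir_w, pte_u, pte_w, user_mode, wp=0):
    """Return "N", "R", or "R/W" for one row of Table 2.6.

    dir_u, dir_w: U and W bits of the page directory entry
    pte_u, pte_w: U and W bits of the second-level page table entry
    user_mode:    True for user access, False for supervisor access
    wp:           WP bit of epsr (ignored for user access)
    """
    effective_u = dir_u & pte_u   # user-level only if both levels agree
    effective_w = dir_w & pte_w   # writable only if both levels agree
    if user_mode:
        if not effective_u:
            return "N"
        return "R/W" if effective_w else "R"
    # Supervisor level: every page is readable; WP = 1 enforces W.
    if wp and not effective_w:
        return "R"
    return "R/W"
```

For instance, a directory entry with U = 1, W = 0 combined with a page table entry with U = 1, W = 1 gives read-only user access, matching the corresponding row of the table.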

2.4.5 ADDRESS TRANSLATION ALGORITHM 

The algorithm below defines the translation of each 
virtual address to a physical address. Let DIR, 
PAGE, and OFFSET be the fields of the virtual ad- 
dress; let PFA1 and PFA2 be the page frame ad- 
dress fields of the first and second level page tables 
respectively; DTB is the page directory table base 
address stored in the dirbase register. 

1. Read the PTE (page table entry) at the physical
address formed by DTB:DIR:00.

2. If P in the PTE is zero, generate a data- or
instruction-access fault.

3. If W in the PTE is zero, the operation is a write,
and either the U-bit of the psr is set or WP = 1,
generate a data- or instruction-access fault.

4. If the U-bit in the PTE is zero and the U-bit in the
psr is set, generate a data- or instruction-access
fault.

5. If A in the PTE is zero, and if the TLB miss
occurred while the bus was locked, generate a
data- or instruction-access fault. (The trap allows
software to set A to one and restart the sequence.
This avoids ambiguity in determining
what address corresponds to a locked semaphore
for external bus hardware use.)

6. If A in the PTE is zero, and if the TLB miss
occurred while the bus was not locked, assert
LOCK#. Re-fetch and check the PTE, set A, and
store the PTE. Deassert LOCK# during the store.

7. Locate the PTE at the physical address formed by
PFA1:PAGE:00.

8. Perform the P, W, U, and A checks as in steps 2
through 6 with the second-level PTE.

9. If D in the PTE is clear and the operation is a
write, generate a data- or instruction-access fault.

10. Form the physical address as PFA2:OFFSET.
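
The ten steps above can be modeled in software as a two-level table walk. The sketch below is illustrative only and rests on stated assumptions: memory is a Python dict mapping physical word addresses to 32-bit entries; the bit positions of P, W, U, A, and D follow the Intel386-style layout (Figure 2.11 is not reproduced here, so treat the positions as assumed); LOCK# signalling and the TLB are not modeled; and the names AccessFault, check_entry, and translate are hypothetical.

```python
# Assumed PTE bit positions (Intel386-style layout).
P, W, U, A, D = 1 << 0, 1 << 1, 1 << 2, 1 << 5, 1 << 6
PFA_MASK = 0xFFFFF000  # page frame address: high-order 20 bits


class AccessFault(Exception):
    """Stands in for a data- or instruction-access fault."""


def check_entry(memory, addr, is_write, user_mode, wp, bus_locked):
    """Steps 2 through 6, applied to the entry stored at `addr`."""
    entry = memory[addr]
    if not entry & P:                                     # step 2
        raise AccessFault("entry not present")
    if is_write and not entry & W and (user_mode or wp):  # step 3
        raise AccessFault("write to write-protected page")
    if user_mode and not entry & U:                       # step 4
        raise AccessFault("user access to supervisor page")
    if not entry & A:
        if bus_locked:                                    # step 5
            raise AccessFault("A clear during locked sequence")
        entry |= A                                        # step 6
        memory[addr] = entry
    return entry


def translate(memory, dtb, va, is_write=False, user_mode=False,
              wp=False, bus_locked=False):
    """Translate virtual address `va`; `dtb` is the directory base."""
    dir_index = (va >> 22) & 0x3FF
    page_index = (va >> 12) & 0x3FF
    offset = va & 0xFFF
    # Step 1: first-level entry at DTB:DIR:00.
    pde = check_entry(memory, (dtb & PFA_MASK) | (dir_index << 2),
                      is_write, user_mode, wp, bus_locked)
    # Steps 7 and 8: second-level entry at PFA1:PAGE:00.
    pte = check_entry(memory, (pde & PFA_MASK) | (page_index << 2),
                      is_write, user_mode, wp, bus_locked)
    if is_write and not pte & D:                          # step 9
        raise AccessFault("write with D clear")
    return (pte & PFA_MASK) | offset                      # step 10
```

With a page directory at 0x1000 whose entry 1 points to a page table at 0x2000, and that table's entry 3 pointing to the page frame at 0x5000, translate(memory, 0x1000, 0x00403025) returns 0x5025 and leaves the A bit set in both entries. A write to a page whose D bit is clear raises AccessFault, which corresponds to the trap that gives the handler the opportunity to set D.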

The i860 XR microprocessor looks only in external
memory for Page Directories and Page Tables during
the translation process. The data cache is not
searched. Therefore, any code which modifies Page 
Directories or Page Tables must keep them out of 
the cache. The tables should be kept in non-cache- 
able memory, or flushed from the cache. 






Table 2.6. Combining Directory and Page Protections

                                     Combined Protection
 Page Directory   Page Table      User      Supervisor
     Entry           Entry        Access      Access
 U-bit   W-bit   U-bit   W-bit    WP = X   WP = 0   WP = 1
 -----   -----   -----   -----    ------   ------   ------
   0       0       0       0        N       R/W       R
   0       0       0       1        N       R/W       R
   0       0       1       0        N       R/W       R
   0       0       1       1        N       R/W       R
   0       1       0       0        N       R/W       R
   0       1       0       1        N       R/W      R/W
   0       1       1       0        N       R/W       R
   0       1       1       1        N       R/W      R/W
   1       0       0       0        N       R/W       R
   1       0       0       1        N       R/W       R
   1       0       1       0        R       R/W       R
   1       0       1       1        R       R/W       R
   1       1       0       0        N       R/W       R
   1       1       0       1        N       R/W      R/W
   1       1       1       0        R       R/W       R
   1       1       1       1       R/W      R/W      R/W

NOTES:
N = No access allowed
R = Read access only
R/W = Both reads and writes allowed
X = Don't care






The i860 XR microprocessor expects Page Directories
and Page Tables to be in little endian format.
The operating system must maintain these tables in
little endian format by either setting BE = 0 when
manipulating the tables or by complementing bit 2 of
the address when loading or storing entries.
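
The second option, complementing bit 2 of the address, amounts to an exclusive-OR with 4 on the address of each 32-bit entry. The following is a minimal sketch of that adjustment only; the function name pte_access_address is hypothetical, and the sketch assumes the adjustment is applied by software to every entry access made while big endian mode is in effect.

```python
def pte_access_address(entry_address, big_endian_mode):
    """Return the address software should use to load or store a
    32-bit page-directory or page-table entry.

    In big endian mode, bit 2 of the address is complemented so that
    the entry is reached where the hardware expects it; in little
    endian mode the address is used unchanged.
    """
    if big_endian_mode:
        return entry_address ^ 0x4   # complement bit 2
    return entry_address
```

For example, with big endian mode in effect, the entry that the hardware reads at 0x1000 would be accessed by software at 0x1004, and the entry at 0x1004 would be accessed at 0x1000.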

2.4.6 ADDRESS TRANSLATION FAULTS 

The address translation fault is one instance of the 
data-access fault. The instruction causing the fault 
can be re-executed upon returning from the trap 
handler. 

2.4.7 PAGE TRANSLATION CACHE 

For greatest efficiency in address translation, the 
i860 XR microprocessor stores the most recently 
used page-table data in an on-chip cache called the 
TLB (translation lookaside buffer). Only if the neces- 
sary paging information is not in the cache must 
both levels of page tables be referenced. 

2.5 Caching and Cache Flushing 

The i860 XR microprocessor has the ability to cache 
instruction, data, and address-translation informa- 
tion in on-chip caches. Caching uses virtual-address 
tags. The effects of mapping two different virtual ad- 
dresses in the same address space to the same 
physical address are undefined. 

Instruction, data, and address-translation caching on 
the i860 XR microprocessor are not transparent. Be- 
cause the data cache uses a write-back protocol, 
writes do not immediately update memory, and 
writes to memory by other bus devices do not up- 
date the cache. Changes to page tables do not auto- 
matically update the TLB, and changes to instruc- 
tions do not automatically update the instruction 
cache. Under certain circumstances, such as I/O 
references, self-modifying code, page-table up- 
dates, or shared data in a multiprocessing system, it 
is necessary to bypass or to flush the caches. The 
i860 XR microprocessor provides the following 
methods for doing this: 

• Bypassing Instruction and Data Caches. If
deasserted during cache-miss processing, the
KEN# pin disables instruction and data caching
of the referenced data. If the CD bit of the associated
second-level PTE is set, caching of data and
instructions is disabled. The i860 XR CPU may
place pages having WT = 1 into the instruction
cache. Future implementations of the i860 XR
architecture may adhere to a write-through data
cache policy. Thus, they may cache pages having
the WT bit of the PTE set. The value of the CD bit
or the WT bit is output on the PTB pin for use by
external caches.

• Invalidating Instruction and Address-Translation
Caches. Storing to the dirbase register with
the ITI bit set invalidates the contents of the in- 
struction and address-translation caches. This bit 
should be set when modifying a page table, when 
modifying a page containing instructions, or when 
changing the DTB field of dirbase or the WP bit 
of the epsr. Note that in order to make the in- 
struction or address-translation caches consist- 
ent with the data cache, the data cache must be 
flushed before invalidating the other caches. 

NOTE: 

The mapping of the page containing the 
currently executing instruction and the 
next six instructions should not be differ- 
ent in the new page tables when st.c dir- 
base changes DTB or activates ITI. The 
six instructions following the st.c should 
be nops and should lie in the same page 
as the st.c. 

• Flushing the Data Cache. The data cache is
flushed by a software routine using the flush in- 
struction. The data cache must be flushed prior to 
invalidating the instruction or address-translation 
caches (as controlled by the ITI bit of dirbase) or 
enabling or disabling address translation (via the 
ATE bit). The data cache does not need flushing 
if the program is modifying only the P, U, W, A, or 
D bits of a PTE (as long as the Page Frame Ad- 
dress is not changed and the PTE itself was not 
in the data cache.) The i860 XR CPU does not 
check these protection bits on cache line write- 
back. Thus, a trap handler can service a DAT for 
D-bit-zero by setting D = 1 and then ITI = 1. In 
the case of setting the P or A bits active, there is
no need to invalidate or flush any caches because
the processor does not load entries into
the TLB that have P = 0 or A = 0. The i860 XR
microprocessor searches only external memory 
for Page Directories and Page Tables in the 
translation process. The data cache is not 
searched. Therefore, Page Tables and Directo- 
ries should be kept in non-cacheable memory, or 
flushed from the cache by any code which ac- 
cesses them. 

2.6 Instruction Set

Table 2.7 shows the complete set of instructions 
grouped by function within processing unit. Refer to 
Section 8 for an algorithmic definition of each in- 
struction. 

The architecture of the i860 XR microprocessor 
uses parallelism to increase the rate at which opera- 
tions may be introduced into the unit. Parallelism in 
the i860 XR microprocessor is not transparent; rath- 
er, programmers have complete control over paral- 
lelism and therefore can achieve maximum perform- 
ance for a variety of computational problems. 

2.6.1 PIPELINED AND SCALAR OPERATIONS 

One type of parallelism used within the floating-point 
unit is "pipelining". The pipelined architecture treats 
each operation as a series of more primitive opera- 
tions (called "stages") that can be executed in par- 
allel. Consider just the floating-point adder unit as an 
example. Let A represent the operation of the adder. 
Let the stages be represented by A1, A2, and A3.
The stages are designed such that stage A(i+1) for one
adder instruction can execute in parallel with stage A(i)
for the next adder instruction. Furthermore, each A(i) can be
executed in just one clock. The pipelining within the
multiplier and graphics units can be described simi- 
larly, except that the number of stages may be differ- 
ent. 

Figure 2.12 illustrates three-stage pipelining as 
found in the floating-point adder (also in the floating- 
point multiplier when single-precision input operands 
are employed). The columns of the figure represent 
the three stages of the pipeline. Each stage holds
intermediate results and also (when introduced into
the first stage by software) holds status information
pertaining to those results. The figure assumes that the
instruction stream consists of a series of consecutive
floating-point instructions, all of one type (i.e. all
adder instructions or all single-precision multiplier
instructions). The instructions are represented as i,
i+1, etc. The rows of the figure represent the states
of the unit at successive clock cycles. Each time a
pipelined operation is performed, the result of the
last stage of the pipeline is stored in the destination
register fdest, the pipeline is advanced one stage,
and the input operands fsrc1 and fsrc2 are
transferred to the first stage of the pipeline.



In the i860 XR microprocessor, the number of pipe- 
line stages ranges from one to three. A pipelined 
operation with a three-stage pipeline stores the re- 
sult of the third prior operation. A pipelined operation 
with a two-stage pipeline stores the result of the sec- 
ond prior operation. A pipelined operation with a 
one-stage pipeline stores the result of the prior oper- 
ation. 
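
The relationship between pipeline depth and the result that each pipelined operation stores can be illustrated with a small software model. This is a sketch for intuition only: it treats each stage as a slot holding the eventual result of one operation, ignores timing and status bits, and the class name Pipeline and method name issue are hypothetical.

```python
from collections import deque


class Pipeline:
    """Toy model of an N-stage pipeline (N is 1, 2, or 3 above)."""

    def __init__(self, stages):
        # stages[0] is the first stage, stages[-1] is the last stage.
        # None marks a stage whose contents are not yet meaningful.
        self.stages = deque([None] * stages, maxlen=stages)

    def issue(self, value):
        """Perform one pipelined operation.

        Returns the result leaving the last stage (the value that
        would be stored in fdest), advances the pipeline one stage,
        and places `value` in the first stage.
        """
        result = self.stages[-1]
        # appendleft on a full bounded deque drops the last stage.
        self.stages.appendleft(value)
        return result
```

With a three-stage pipeline, issuing operations r1, r2, r3, r4 in order returns nothing meaningful for the first three and returns r1 for the fourth, which is the result of the third prior operation. With a two-stage pipeline, the third operation issued returns the result of the first.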

There are four floating-point pipelines: one for the 
multiplier, one for the adder, one for the graphics 
unit, and one for floating-point loads. The adder 
pipeline has three stages. The number of stages in 
the multiplier pipeline depends on the precision of 
the source operands in the pipeline. Single precision 
has three stages and double precision has two 
stages. The graphics unit has one stage for all preci- 
sions. The load pipeline has three stages for all pre- 
cisions. 

Changing the FZ (flush zero), RM (rounding mode), 
or RR (result register) bits of fsr while there are re- 
sults in either the multiplier or adder pipeline produc- 
es effects that are not defined. 



2.6.1.1 Scalar Mode 

In addition to the pipelined execution mode, the i860 
XR microprocessor also can execute floating-point 
instructions in "scalar" mode. Most floating-point in- 
structions have both pipelined and scalar variants, 
distinguished by a bit in the instruction encoding. In 
scalar mode, the floating-point unit does not start a 
new operation until the previous floating-point oper- 
ation is completed. The scalar operation passes 
through all stages of its pipeline before a new opera- 
tion is introduced, and the result is stored automati- 
cally. Scalar mode is used when the next operation 
depends on results from the previous few floating- 
point operations (or when the compiler or program- 
mer does not want to deal with pipelining). 

2.6.1.2 Pipelining Status Information 

Result status information in the fsr consists of the
AA, AI, AO, AU, and AE bits, in the case of the adder,
and the MA, MI, MO, and MU bits, in the case of
the multiplier. This information arrives at the fsr via
the pipeline in one of two ways:

1. It is calculated by the last stage of the pipeline.
This is the normal case.

2. It is propagated from the first stage of the pipeline.
This method is used when restoring the state
of the pipeline after a preemption. When a store
instruction updates the fsr and the value of the
U bit in the word being written into the fsr is set,
the store updates the result status bits in the first
stage of both the adder and multiplier pipelines.
When software changes the result-status bits of
the first stage of a particular unit (multiplier or adder),
the updated result-status bits are propagated
one stage for each pipelined floating-point operation
for that unit. In this case, each stage of the
adder and multiplier pipelines holds its own copy
of the relevant bits of the fsr. When they reach
the last stage, they override the normal
result-status bits computed from the last stage result.

At the next floating-point instruction (or at certain
core instructions), after the result reaches the last
stage, the i860 XR microprocessor traps if any of the
status bits of the fsr indicate exceptions. Note that
the instruction that creates the exceptional condition
is not the instruction at which the trap occurs.

Table 2.7. Instruction Set

Core Unit

Mnemonic     Description
---------    -----------------------------------
Load and Store Instructions
ld.x         Load integer
st.x         Store integer
fld.y        F-P load
pfld.z       Pipelined F-P load
fst.y        F-P store
pst.d        Pixel store

Register to Register Moves
ixfr         Transfer integer to F-P register

Integer Arithmetic Instructions
addu         Add unsigned
adds         Add signed
subu         Subtract unsigned
subs         Subtract signed

Shift Instructions
shl          Shift left
shr          Shift right
shra         Shift right arithmetic
shrd         Shift right double

Logical Instructions
and          Logical AND
andh         Logical AND high
andnot       Logical AND NOT
andnoth      Logical AND NOT high
or           Logical OR
orh          Logical OR high
xor          Logical exclusive OR
xorh         Logical exclusive OR high

Control-Transfer Instructions
trap         Software trap
intovr       Software trap on integer overflow
br           Branch direct
bri          Branch indirect
bc           Branch on CC
bc.t         Branch on CC taken
bnc          Branch on not CC
bnc.t        Branch on not CC taken
bte          Branch if equal
btne         Branch if not equal
bla          Branch on LCC and add
call         Subroutine call
calli        Indirect subroutine call

System Control Instructions
flush        Cache flush
ld.c         Load from control register
st.c         Store to control register
lock         Begin interlocked sequence
unlock       End interlocked sequence

Floating-Point Unit

Mnemonic     Description
---------    -----------------------------------
Register to Register Moves
fxfr         Transfer F-P to integer register

F-P Multiplier Instructions
fmul.p       F-P multiply
pfmul.p      Pipelined F-P multiply
pfmul3.dd    3-Stage pipelined F-P multiply
fmlow.p      F-P multiply low
frcp.p       F-P reciprocal
frsqr.p      F-P reciprocal square root

F-P Adder Instructions
fadd.p       F-P add
pfadd.p      Pipelined F-P add
famov.r      F-P adder move
pfamov.r     Pipelined F-P adder move
fsub.p       F-P subtract
pfsub.p      Pipelined F-P subtract
pfgt.p       Pipelined F-P greater-than compare
pfeq.p       Pipelined F-P equal compare
fix.p        F-P to integer conversion
pfix.p       Pipelined F-P to integer conversion
ftrunc.p     F-P to integer truncation
pftrunc.p    Pipelined F-P to integer truncation

Dual-Operation Instructions
pfam.p       Pipelined F-P add and multiply
pfsm.p       Pipelined F-P subtract and multiply
pfmam.p      Pipelined F-P multiply with add
pfmsm.p      Pipelined F-P multiply with subtract

Long Integer Instructions
fisub.z      Long-integer subtract
pfisub.z     Pipelined long-integer subtract
fiadd.z      Long-integer add
pfiadd.z     Pipelined long-integer add

Graphics Instructions
fzchks       16-bit Z-buffer check
pfzchks      Pipelined 16-bit Z-buffer check
fzchkl       32-bit Z-buffer check
pfzchkl      Pipelined 32-bit Z-buffer check
faddp        Add with pixel merge
pfaddp       Pipelined add with pixel merge
faddz        Add with Z merge
pfaddz       Pipelined add with Z merge
form         OR with MERGE register
pform        Pipelined OR with MERGE register

Assembler Pseudo-Operations

Mnemonic     Description
---------    -----------------------------------
mov          Integer register-register move
fmov.r       F-P reg-reg move
pfmov.r      Pipelined F-P reg-reg move
nop          Core no-operation
fnop         F-P no-operation
pfle.p       Pipelined F-P less-than or equal

Figure 2.12. Pipelined Instruction Execution

2.6.1.3 Precision in the Pipelines 

In pipelined mode, when a floating-point operation is 
initiated, the result of an earlier pipelined floating- 
point operation is returned. The result precision of 
the current instruction applies to the operation being 
initiated. The precision of the value stored in fdest is 
that which was specified by the instruction that initia- 
ted that operation. 

Figure 2.13. Dual-Instruction Mode Transitions



If fdest is the same as fsrc1 or fsrc2, the value being
stored in fdest is used as the input operand. In this 
case, the precision of fdest must be the same as the 
source precision. 

The multiplier pipeline has two stages when the 
source operand is double-precision and three stages 
when the precision of the source operand is single. 
This means that a pipelined multiplier operation 
stores the result of the second previous multiplier 
operation for double-precision inputs and third previ- 
ous for single-precision inputs (except when chang- 
ing precisions). 

2.6.1.4 Transition between Scalar and Pipelined 
Operations 

When a scalar operation is executed, it passes 
through all stages of the pipeline; therefore, any un- 
stored results in the affected pipeline are lost. To 
avoid losing information, the last pipelined opera- 
tions before a scalar operation should be dummy 
pipelined operations that unload unstored results 
from the affected pipeline. 



After a scalar operation, the values of all pipeline 
stages of the affected unit (except the last) are un- 
defined. No spurious result-exception traps result 
when the undefined values are subsequently stored 
by pipelined operations; however, the values should 
not be referenced as source operands. 

For best performance a scalar operation should not 
immediately precede a pipelined operation whose 
fdest is nonzero. 



2.6.2 DUAL-INSTRUCTION MODE 

Another form of parallelism results from the fact that 
the i860 XR microprocessor can execute both a 
floating-point and a core instruction simultaneously. 
Such parallel execution is called dual-instruction 
mode. When executing in dual-instruction mode, the 
instruction sequence consists of 64-bit aligned in- 
structions with a floating-point instruction in the low- 
er 32 bits and a core instruction in the upper 32 bits. 
Table 2.7 identifies which instructions are executed 
by the core unit and which by the floating-point unit. 
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Programmers specify dual-instruction mode either 
by including in the mnemonic of a floating-point in- 
struction a d. prefix or by using the Assembler direc- 
tives .dual . . . .enddual. Both of the specifications 
cause the D-bit of floating-point instructions to be 
set. If the i860 XR microprocessor is executing in 
single-instruction mode and encounters a floating- 
point instruction with the D-bit set, one more 32-bit 
instruction is executed before dual-mode execution 
begins. If the i860 XR microprocessor is executing in 
dual-instruction mode and a floating-point instruction 
is encountered with a clear D-bit, then one more pair 
of instructions is executed before resuming single-in- 
struction mode. Figure 2.13 illustrates two variations 
of this sequence of events: one for extended se- 
quences of dual-instructions and one for a single in- 
struction pair. 

When a 64-bit dual-instruction pair sequentially fol- 
lows a delayed branch instruction in dual-instruction 
mode, both 32-bit instructions are executed. 

2.6.3 DUAL-OPERATION INSTRUCTIONS 

Special dual-operation floating-point instructions 
(add-and-multiply, subtract-and-multiply) use both 
the multiplier and adder units within the floating- 
point unit in parallel to efficiently execute such com- 
mon tasks as evaluating systems of linear equa- 
tions, performing the Fast Fourier Transform (FFT), 
and performing graphics transformations. 

The instructions pfam fsrc1, fsrc2, fdest (add and
multiply), pfsm fsrc1, fsrc2, fdest (subtract and
multiply), pfmam fsrc1, fsrc2, fdest (multiply and add),
and pfmsm fsrc1, fsrc2, fdest (multiply and subtract)
initiate both an adder operation and a multiplier
operation. Six operands are required, but the instruction
format specifies only three operands; therefore,
there are special provisions for specifying the operands.
These special provisions consist of:

• Three special registers (KR, KI, and T) that can
store values from one dual-operation instruction
and supply them as inputs to subsequent
dual-operation instructions.

1. The constant registers KR and KI can store the
value of fsrc1 and subsequently supply that
value to the multiplier pipeline in place of fsrc1.

2. The transfer register T can store the last stage
result of the multiplier pipeline and subsequently
supply that value to the adder pipeline
in place of fsrc1.

• A four-bit data-path control field in the opcode
(DPC) that specifies the operands and loading of
the special registers.

1. Operand-1 of the multiplier can be KR, KI, or
fsrc1.

2. Operand-2 of the multiplier can be fsrc2 or the
last stage result of the adder pipeline.

3. Operand-1 of the adder can be fsrc1, the
T-register, or the last stage result of the adder
pipeline.

4. Operand-2 of the adder can be fsrc2, the last
stage result of the multiplier pipeline, or the
last stage result of the adder pipeline.

Figure 2.14 shows all the possible data paths
surrounding the adder and multiplier. The DPC field in
these instructions selects among the data paths. Table
8.8 shows the various encodings of the DPC field.
Refer to the Dual Operation Instructions section in the
i860 Microprocessor Programmer's Reference Manual
for a pictorial description.



Figure 2.14. Dual-Operation Data Paths

Note that the mnemonics pfam.p, pfsm.p,
pfmam.p, and pfmsm.p are never used as such in
the assembly language; these mnemonics are used
here to designate classes of related instructions.
Each value of DPC has a unique mnemonic associated
with it.



2.7 Addressing Modes 

Data access is limited to load and store instructions.
Memory addresses are computed from two fields of
load and store instructions: isrc1 and isrc2.

1. isrc1 either contains the identifier of a 32-bit integer
register or contains an immediate 16-bit address
offset.

2. isrc2 always specifies a register.
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Table 2.8. Types of Traps 



                   Indication                  Caused by
Type               PSR, EPSR   FSR       Condition                         Instruction
-----------------  ----------  --------  --------------------------------  ------------------------------
Instruction        IT, OF                Software traps                    trap, intovr
Fault              IL                    Missing unlock                    Any

Floating           FT          SE        Floating-point source exception   Any M- or A-unit except fmlow
Point                                    Floating-point result exception   Any M- or A-unit except fmlow,
Fault                          AO, MO      overflow                        pfgt, and pfeq. Reported on
                               AU, MU      underflow                       any F-P instruction plus pst,
                               AI, MI      inexact result                  fst, and sometimes fld, pfld,
                                                                           ixfr

Instruction        IAT                   Address translation exception     Any
Access Fault                             during instruction fetch

Data Access        DAT*                  Load/store address translation    Any load/store
Fault                                    exception
                                         Misaligned operand address        Any load/store
                                         Operand address matches           Any load/store
                                         db register

Interrupt          IN                    External interrupt

Reset              No trap bits set      Hardware RESET signal

NOTES:

* These cases can be distinguished by examining the operand addresses.

The IL bit of the epsr must be checked by the trap handler to tell if the bus is currently in a locked sequence.



Because either isrc1 or isrc2 may be null (zero), a
variety of useful addressing modes result:

offset + register     Useful for accessing fields within
                      a record, where register points
                      to the beginning of the record.
                      Useful for accessing items in a
                      stack frame, where register is
                      r3, the register used for pointing
                      to the beginning of the stack
                      frame.

register + register   Useful for two-dimensional arrays
                      or for array access within
                      the stack frame.

register              Useful as the end result of any
                      arbitrary address calculation.

offset                Absolute address into the first or
                      last 32K of the logical address
                      space.



In addition, the floating-point load and store instructions
may select autoincrement addressing. In this
mode isrc2 is replaced by the sum of isrc1 and isrc2
after performing the load or store. This mode makes
stepping through arrays more efficient, because it
eliminates one address-calculation instruction.
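
The address calculation described above can be sketched as follows.
This is an illustration only: the function names are hypothetical,
register 0 is assumed to read as zero (the "null" operand), and the
16-bit offset is assumed to be sign-extended, which is inferred from
the statement that an offset alone reaches the first or last 32K of
the address space.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed quantity (assumption)."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def effective_address(regs, isrc2, isrc1=None, offset=0, autoincrement=False):
    """Compute a load/store address from isrc1 (register or offset) and isrc2.

    regs is a list of 32 integer register values; regs[0] is assumed to be 0.
    isrc1 is a register number, or None when the 16-bit offset form is used.
    """
    first = regs[isrc1] if isrc1 is not None else sign_extend16(offset)
    address = (regs[isrc2] + first) & MASK32
    if autoincrement and isrc2 != 0:
        # Autoincrement: isrc2 is replaced by the sum after the access.
        regs[isrc2] = address
    return address


if __name__ == "__main__":
    regs = [0] * 32
    regs[3] = 0x1000
    print(hex(effective_address(regs, 3, offset=8)))   # offset + register
    print(hex(effective_address(regs, 0, offset=-4)))  # offset only
```

With autoincrement selected, repeated calls using the same offset step
through an array without a separate address-calculation instruction.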



2.8 Traps and Interrupts 

Traps are caused by exceptional conditions detected
in programs or by external interrupts. Traps
cause interruption of normal program flow to execute
a special program known as a trap handler.
Traps are divided into the types shown in Table 2.8.
Interrupts and traps start execution in single-instruction
mode at virtual address 0xFFFFFF00 in supervisor
level (U = 0).

2.8.1 TRAP HANDLER INVOCATION 

This section applies to traps other than reset. When 
a trap occurs, execution of the current instruction is 
aborted. The instruction is restartable. The proces- 
sor takes the following steps while transferring con- 
trol to the trap handler: 

1. Copies U (user mode) of the psr into PU (previous
U).

2. Copies IM (interrupt mode) into PIM (previous IM).

3. Sets U to zero (supervisor mode).

4. Sets IM to zero (interrupts disabled).

5. If the processor is in dual-instruction mode, it sets
DIM; otherwise it clears DIM.

6. If the processor is in single-instruction mode and
the next instruction will be executed in dual-instruction
mode or if the processor is in dual-instruction
mode and the next instruction will be
executed in single-instruction mode, DS is set;
otherwise, it is cleared.

7. The appropriate trap type bits in psr are set (IT,
IN, IAT, DAT, FT). Several bits may be set if the
corresponding trap conditions occur simultaneously.

8. An address is placed in the fault instruction register
(fir) to help locate the trapped instruction. In
single-instruction mode, the address in fir is the
address of the trapped instruction itself. In dual-instruction
mode, the address in fir is that of the
floating-point half of the dual instruction. If an instruction
or data access fault occurred, the associated
core instruction is the high-order half of the
dual instruction (fir + 4). In dual-instruction
mode, when a data access fault occurs in the absence
of other trap conditions, the floating-point
half of the dual instruction will already have been
executed.

The processor begins executing the trap handler
by transferring execution to virtual address
0xFFFFFF00. The trap handler begins execution in
single-instruction mode. The trap handler must examine
the trap-type bits in psr (IT, IN, IAT, DAT, FT)
to determine the cause or causes of the trap.
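
The eight steps above can be summarized in a short model. This is a
sketch under stated assumptions: the psr is represented as a dictionary
of named flags rather than as a bit field (bit positions are not given
in this section), and the function name enter_trap and its parameters
are hypothetical.

```python
TRAP_VECTOR = 0xFFFFFF00  # virtual address of the trap handler


def enter_trap(psr, causes, dual_mode, next_dual_mode, trapped_address):
    """Return (new_psr, fir, next_pc) following steps 1 through 8.

    psr             dict mapping flag names to 0 or 1
    causes          iterable drawn from 'IT', 'IN', 'IAT', 'DAT', 'FT'
    dual_mode       True if the processor is in dual-instruction mode
    next_dual_mode  True if the next instruction would run in dual mode
    trapped_address address of the trapped instruction (in dual mode,
                    the address of the floating-point half of the pair)
    """
    new = dict(psr)
    new["PU"] = psr["U"]                               # step 1
    new["PIM"] = psr["IM"]                             # step 2
    new["U"] = 0                                       # step 3: supervisor mode
    new["IM"] = 0                                      # step 4: interrupts disabled
    new["DIM"] = 1 if dual_mode else 0                 # step 5
    new["DS"] = 1 if dual_mode != next_dual_mode else 0  # step 6
    for bit in causes:                                 # step 7
        new[bit] = 1
    fir = trapped_address                              # step 8
    return new, fir, TRAP_VECTOR
```

Because several trap bits may be set at once, a handler built on this
model would test every bit rather than stop at the first one found.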

2.8.2 INSTRUCTION FAULT 

This fault is caused by any of the following condi- 
tions. In all cases the processor sets the IT bit be- 
fore entering the trap handler. 

1. By the trap instruction. When trap is executed in
dual-instruction mode, the floating-point compan- 
ion of the trap instruction is not executed before 
the trap is taken. 

2. By the intovr instruction. The trap occurs only if 
OF in epsr is set when intovr is executed. The 
trap handler should clear OF before returning. 
When intovr causes a trap in dual-instruction 
mode, the floating-point companion of the intovr 
instruction is completely executed before the trap 
is taken. 

3. By violation of lock/unlock protocol, explained be- 
low. (Note that trap and intovr should not be 
used within a locked sequence; otherwise, it 
would be difficult to distinguish between this and 
the prior cases.) 

The lock protocol requires the following sequence
of activities:

1. lock

2. Any load or store instruction that misses the
cache

3. unlock

4. Any load or store instruction (regardless of
whether it misses the cache)

There may be other instructions between any of
these steps. The bus is locked after step 2, and remains
locked until step 4. Step 4 must follow step 1
by 30 instructions or fewer; otherwise, the instruction
trap occurs. In case of a trap, IL is also set. If the
load or store instruction in step 2 hits the cache, the
sequence is legal, but the bus is not locked.
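
The 30-instruction limit can be illustrated with a simple checker. This
is a rough sketch, not the processor's actual counting logic: the
function name is hypothetical, instructions are represented as plain
strings, and it is assumed that every instruction after lock counts
toward the limit and that the first load or store after unlock closes
the sequence.

```python
LIMIT = 30  # maximum distance, in instructions, from lock to the closing access


def lock_sequence_traps(instructions, limit=LIMIT):
    """Return True if this simplified model would raise an instruction trap."""
    locked_at = None   # index of the pending lock instruction, if any
    unlocked = False   # True once unlock has been seen in this sequence
    for i, op in enumerate(instructions):
        if locked_at is not None and i - locked_at > limit:
            return True                       # step 4 did not arrive in time
        if op == "lock":
            locked_at, unlocked = i, False    # step 1
        elif op == "unlock" and locked_at is not None:
            unlocked = True                   # step 3
        elif op in ("load", "store") and unlocked:
            locked_at, unlocked = None, False  # step 4 closes the sequence
    return False


if __name__ == "__main__":
    print(lock_sequence_traps(["lock", "load", "add", "store", "unlock", "store"]))
    print(lock_sequence_traps(["lock", "load"] + ["add"] * 40 + ["unlock", "store"]))
```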



2.8.3 FLOATING-POINT FAULT 

The floating-point fault is reported on floating-point
instructions, pst, fst, and sometimes fld, pfld, ixfr.
The floating-point faults of the i860 XR microproces- 
sor support the floating-point exceptions defined by 
the IEEE standard as well as some other useful 
classes of exceptions. The i860 XR microprocessor 
divides these into two classes: source exceptions 
and result exceptions. The numerics library supplied 
by Intel provides the IEEE standard default handling 
for all these exceptions. 

2.8.3.1 Source Exception Faults 

When used as inputs to the multiplier or adder, all 
exceptional operands, including infinities, denormal- 
ized numbers and NaNs, cause a floating-point fault 
and set SE in the fsr. Source exceptions are report- 
ed on the instruction that initiates the operation. For 
pipelined operations, the pipeline is not advanced. 

The SE value is undefined for faults on fld, pfld, fst,
pst, and ixfr instructions when in single-instruction
mode or when in dual-instruction mode and the companion
instruction is not a multiplier or adder operation.

2.8.3.2 Result Exception Faults 

The class of result exceptions includes any of the
following conditions:

• Overflow. The absolute value of the rounded
true result would exceed the largest positive finite
number in the destination format.

• Underflow (when FZ is clear). The absolute value
of the rounded true result would be smaller
than the smallest positive finite number in the
destination format.

• Inexact result (when TI is set). The result is not
exactly representable in the destination format.
For example, the fraction 1/3 cannot be precisely
represented in binary form. This exception occurs
frequently and indicates that some (generally
acceptable) accuracy has been lost.
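
The three conditions can be illustrated for a single-precision
destination. This is a sketch under several assumptions: a Python
double stands in for the true result, rounding is done by the standard
struct module rather than by the i860 rounding modes, "smallest
positive finite number" is taken to mean the smallest normalized
single-precision value, and the function names are hypothetical. An
overflow is reported alone, without also flagging it as inexact.

```python
import struct

FLT_MIN_NORMAL = 2.0 ** -126  # smallest normalized single-precision value


def round_to_single(x):
    """Round a Python float to single precision; raises OverflowError if too large."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def result_exceptions(true_result, fz=False, ti=False):
    """Return the set of result exceptions for a single-precision destination."""
    try:
        rounded = round_to_single(true_result)
    except OverflowError:
        return {"overflow"}
    flags = set()
    if not fz and true_result != 0.0 and abs(rounded) < FLT_MIN_NORMAL:
        flags.add("underflow")      # reported only when FZ is clear
    if ti and rounded != true_result:
        flags.add("inexact")        # reported only when TI is set
    return flags


if __name__ == "__main__":
    print(result_exceptions(1e39))
    print(result_exceptions(1 / 3, ti=True))
```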

The point at which a result exception is reported de- 
pends upon whether pipelined operations are being 
used: 

• Scalar (nonpipelined) operations. Result exceptions
are reported on the next floating-point,
fst.x, or pst.x (and sometimes fld, pfld, ixfr)
instruction after the scalar operation. When a trap
occurs, the last stage of the affected unit contains
the result of the scalar operation.

• Pipelined operations. Result exceptions are reported
when the result is in the last stage and the
next floating-point, fst.x, or pst.x (and sometimes
fld, pfld, ixfr) instruction is executed. When a
trap occurs, the pipeline is not advanced, and the
last stage results (that caused the trap) remain
unchanged.

When no trap occurs (either because FTE is clear or
because no exception occurred), the pipeline is advanced
normally by the new floating-point operation.

The result-status bits of the affected unit are unde- 
fined until the point that result exceptions are report- 
ed. At this point, the last stage result-status bits (bits 
29..22 and 16..9 of the fsr) reflect the values in the 
last stages of both the adder and multiplier. For ex- 
ample, if the last stage result in the multiplier has 
overflowed and a pipelined floating-point pfadd is 
started, a trap occurs and MO is set. 

For scalar operations, the RR bits of fsr specify the 
register in which the result was stored. RR is updat- 
ed when the scalar instruction is initiated. The trap, 
however, occurs on a subsequent instruction. Pro- 
grammers must prevent intervening stores to fsr 
from modifying the RR bits. Prevention may take one 
of the following forms: 

• Before any store to fsr when a result exception 
may be pending, execute a dummy floating-point 
operation to trigger the result-exception trap. 

• Always read from fsr before storing to it, and 
mask updates so that the RR bits are not 
changed. 
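
The second form of prevention (read, mask, then store) can be sketched
as below. The position of the RR field used here (five bits at 21..17)
is an assumption that is not stated in this section and should be
checked against the fsr layout; the function name merge_fsr is
hypothetical.

```python
RR_MASK = 0x1F << 17  # ASSUMED position of the RR field; verify against the fsr layout


def merge_fsr(current_fsr, new_value, preserve_mask=RR_MASK):
    """Combine new_value with current_fsr so that the preserved bits keep their old value."""
    kept = current_fsr & preserve_mask       # bits that must not change (RR)
    updated = new_value & ~preserve_mask     # every other bit comes from new_value
    return (kept | updated) & 0xFFFFFFFF


if __name__ == "__main__":
    current = (0b10101 << 17) | 0x3
    print(hex(merge_fsr(current, 0x20)))
```

The value returned by merge_fsr is what would be stored back to fsr, so
a pending result exception still identifies the correct register.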

For pipelined operations, RR is cleared and the re- 
sult is in the last stage of the pipeline of the appro- 
priate unit. The trap handler must flush the pipeline, 
saving the results and the status bits. 

In either pipelined or scalar mode, the trap handler
must then compute the trapping result. In either
case, the delivered result has the same fraction as the
true result, and its exponent holds the low-order
bits of the true result's exponent. The trap handler can inspect
the result, compute the result appropriate for that
instruction (a NaN or an infinity, for example), and
store the correct result. The result is either stored in
the register specified by RR (if nonzero) or (if RR =
0) the trap handler must reload the pipeline with the
saved results and status bits.

Result exceptions may be reported for both the ad- 
der and multiplier units at the same time. In this 
case, the trap handler should fix up the last stage of 
both pipelines. 

2.8.4 INSTRUCTION ACCESS FAULT 

This trap occurs during address translation for in- 
struction fetches in any of these cases: 

• The address fetched is in a page whose P (pres- 
ent) bit in the page table is clear (not present). 

• The address fetched is in a supervisor mode 
page, but the processor is in user mode. 

• The address fetched is in a page whose PTE has 
A = 0, and the access occurs during a locked 
sequence (i.e., between lock and unlock). 



Note that several instructions are fetched at one 
time, either due to instruction prefetching or to in- 
struction caching. Therefore, a trap handler can 
change from supervisor to user mode and continue 
to execute instructions fetched from a supervisor 
page. An instruction access trap occurs only when 
the next group of instructions is fetched from a su- 
pervisor page (up to eight instructions later). If, in the 
meantime, the handler branches to a user page, no 
instruction access trap occurs. No protection viola- 
tion results, because the processor does not permit 
data accesses to supervisor pages while running in 
user mode. 

2.8.5 DATA ACCESS FAULT 

This trap results from an abnormal condition detect- 
ed during data operand fetch or store. Such an ex- 
ception can be due only to one of the following caus- 
es: 

• An attempt is being made to write to a page 
whose D (Dirty) bit is clear. 

• A memory operand is misaligned (is not located 
at an address that is a multiple of the length of 
the data). 

• The address stored in the db register is equal to 
one of the addresses spanned by the operand. 

• The operand is in a not-present page. 

• An attempt is being made from user level to write 
to a read-only page or to access a supervisor-lev- 
el page. 

• The operand was in a page whose PTE had A =
0, and the access occurred during a locked sequence
(i.e., between lock and unlock).

• Write protection (determined by epsr bit WP = 1) 
is violated in supervisor mode. 
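
Two of the causes above depend only on the operand address and can be
checked with simple arithmetic. This is a minimal sketch: the function
name is hypothetical, the page-level causes are not modeled, and the
breakpoint comparison is assumed to be gated by the BR (break read) and
BW (break write) bits of the psr described in section 2.9.

```python
def data_access_fault_causes(address, length, db, is_write, br, bw):
    """Return the address-based causes of a data access fault.

    address, length  operand address and size in bytes
    db               value of the data breakpoint register
    is_write         True for a store, False for a load
    br, bw           break-read and break-write enable bits
    """
    causes = []
    if address % length != 0:
        # The operand is not at a multiple of its own length.
        causes.append("misaligned")
    enabled = bw if is_write else br
    if enabled and address <= db < address + length:
        # db equals one of the addresses spanned by the operand.
        causes.append("breakpoint")
    return causes


if __name__ == "__main__":
    print(data_access_fault_causes(0x1002, 4, 0, False, 0, 0))
    print(data_access_fault_causes(0x1000, 8, 0x1007, True, 0, 1))
```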

2.8.6 INTERRUPT TRAP 

An interrupt is an event that is signaled from an ex- 
ternal source. If the processor is executing with in- 
terrupts enabled (IM set in the psr), the processor 
sets the interrupt bit IN in the psr, and generates an 
interrupt trap. Vectored interrupts are implemented 
by interrupt controllers and software. 

2.8.7 RESET TRAP 

When the i860 XR microprocessor is reset, execution
begins in single-instruction mode at physical address
0xFFFFFF00. This is the same address as for
other traps. The reset trap can be distinguished from
other traps by the fact that no trap bits are set. The
instruction cache is flushed. The bits DPS, BL, and
ATE in dirbase are cleared. CS8 is initialized by the
value at the INT pin at the end of reset. The read-only
fields of the epsr are set to identify the processor,
while the IL, WP, and PBM bits are cleared. The
bits U, IM, BR, and BW in psr are cleared, as are the
trap bits FT, DAT, IAT, IN, and IT. All other bits of
psr and all other register contents are undefined.
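
Because reset and all other traps share one entry address, the first
task of the code at that address is to decode the trap bits. A minimal
sketch, with the psr again modeled as a dictionary of named flags and a
hypothetical function name:

```python
TRAP_BITS = ("IT", "IN", "IAT", "DAT", "FT")


def trap_causes(psr):
    """Return the list of trap bits that are set, or ['RESET'] if none is set."""
    causes = [bit for bit in TRAP_BITS if psr.get(bit)]
    return causes or ["RESET"]


if __name__ == "__main__":
    print(trap_causes({"IT": 0, "IN": 0, "IAT": 0, "DAT": 0, "FT": 0}))
    print(trap_causes({"DAT": 1, "FT": 1}))
```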

Refer to Table 2.9 for a summary of these initial set- 
tings. 

Table 2.9. Register and Cache Values after Reset 



Registers                   Initial Value
--------------------------  ------------------------------------------
Integer Registers           Undefined
Floating-Point Registers    Undefined
psr                         U, IM, BR, BW, FT, DAT, IAT, IN, IT = 0;
                            others are undefined
epsr                        IL, WP, PBM, BE = 0; Processor Type,
                            Stepping Number, DCS are read only;
                            others are undefined
db                          Undefined
dirbase                     DPS, BL, ATE = 0; others are undefined
fir                         Undefined
fsr                         Undefined
KR, KI, T, MERGE            Undefined

Caches                      Initial Value
--------------------------  ------------------------------------------
Instruction Cache           Flushed
Data Cache                  Undefined
TLB                         Flushed



The software must ensure that the data cache is 
flushed and control registers are properly initialized 
before performing operations that depend on the 
values of the cache or registers. The data cache has 
no "validity" bits, so memory accesses before the 
flush may result in false data cache hits. 

Reset code must initialize the floating-point pipeline 
state to zero with floating-point traps disabled to en- 
sure that no spurious floating-point traps are gener- 
ated. 

After a RESET the i860 XR microprocessor starts 
execution at supervisor level (U = 0). Before branch- 
ing to the first user-level instruction, the RESET trap 
handler or subsequent initialization code has to set 
PU and a trap bit so that an indirect branch instruc- 
tion will copy PU to U, thereby changing to user level. 

2.9 Debugging 

The i860 XR microprocessor supports debugging 
with both data and instruction breakpoints. The fea- 
tures of the i860 XR architecture that support debug- 
ging include: 

• db (data breakpoint register), which permits
specification of a data address that the i860 XR
microprocessor will monitor.

• BR (break read) and BW (break write) bits of the
psr, which enable trapping of either reads or
writes (respectively) to the address in db.

• DAT (data access trap) bit of the psr, which allows
the trap handler to determine when a data
breakpoint was the cause of the trap.

• trap instruction, which can be used to set breakpoints
in code. Any number of code breakpoints
can be set. The values of the isrc1 and isrc2
fields help identify which breakpoint has occurred.

• IT (instruction trap) bit of the psr, which allows
the trap handler to determine when a trap
instruction was the cause of the trap.



3.0 HARDWARE INTERFACE 

In the following description of hardware interface, 
the # symbol at the end of a signal name indicates 
that the active or asserted state occurs when the 
signal is at a low voltage. When no # is present after 
the signal name, the signal is asserted when at the 
high voltage level. 



3.1 Signal Description 

Table 3.1 identifies functional groupings of the pins, 
lists every pin by its identifier, gives a brief descrip- 
tion of its function, and lists some of its characteris- 
tics. All output pins are tristate, except HLDA and 
BREQ. All inputs are synchronous, except HOLD 
and INT. 



3.1.1 CLOCK (CLK) 

The CLK input determines execution rate and timing 
of the i860 XR microprocessor. Timing of other sig- 
nals is specified relative to the rising edge of this 
signal. The i860 XR microprocessor can utilize a 
clock rate of 25 MHz, 33.3 MHz or 40 MHz. The 
internal operating frequency is the same as the ex- 
ternal clock. 



3.1.2 SYSTEM RESET (RESET) 

Asserting RESET for at least 16 CLK periods causes 
initialization of the i860 XR microprocessor. Refer to 
section 3.2 "Initialization" for more details related to 
RESET. 



3.1.3 BUS HOLD (HOLD) AND BUS HOLD 
ACKNOWLEDGE (HLDA) 

These pins are used for i860 XR microprocessor bus
arbitration. At some clock after the HOLD signal is
asserted, the i860 XR microprocessor releases control
of the local bus and puts all bus interface outputs
(except BREQ and HLDA) into a floating state,
then asserts HLDA, all during the same clock period.
It maintains this state until HOLD is deasserted.
Instruction execution stops only if required instructions
or data cannot be read from the on-chip instruction
and data caches.

Table 3.1. Pin Summary

Pin Name     Function                     Active State   Input/Output
-----------  ---------------------------  -------------  ------------

Execution Control Pins
CLK          CLocK                                       I
RESET        System reset                 High           I
HOLD         Bus hold                     High           I
HLDA         Bus hold acknowledge         High           O
BREQ         Bus request                  High           O
INT/CS8      Interrupt, code-size         High           I

Bus Interface Pins
A31-A3       Address bus                  High           O
BE7#-BE0#    Byte Enables                 Low            O
D63-D0       Data bus                     High           I/O
LOCK#        Bus lock                     Low            O
W/R#         Write/Read bus cycle         High/Low       O
NENE#        NExt NEar                    Low            O
NA#          Next Address request         Low            I
READY#       Transfer Acknowledge         Low            I
ADS#         ADdress Status               Low            O

Cache Interface Pins
KEN#         Cache ENable                 Low            I
PTB          Page Table Bit               High           O

Testability Pins
SHI          Boundary Scan Shift Input    High           I
BSCN         Boundary Scan Enable         High           I
SCAN         Shift Scan Path              High           I

Intel-Reserved Configuration Pins
CC1-CC0      Configuration                High           I

Power and Ground Pins
Vcc          System power
Vss          System ground

A # after a pin name indicates that the signal is active when at the low voltage level.

The time required to acknowledge a hold request is 
one clock plus the number of clocks needed to finish 
any outstanding bus cycles. HOLD is recognized 
even while RESET or LOCK# is asserted. 

When leaving a bus hold, the i860 XR microproces- 
sor deactivates HLDA and, in the same clock period, 
initiates a pending bus cycle, if any. 

Hold is an asynchronous input. 



3.1.4 BUS REQUEST (BREQ) 

This signal is asserted when the i860 XR microproc- 
essor has a pending memory request, even when 
HLDA is asserted. This allows an external bus arbi- 
ter to implement an "on demand only" policy for 
granting the bus to the i860 XR microprocessor. 
BREQ is asserted the clock after the i860 XR micro- 
processor realizes an internal request for the bus. In 
normal operation, BREQ goes low the clock after 
ADS# goes low for the final pending bus cycle. (Refer
to Figure 4.10 for timing information.) During data
or instruction cache fills, however, BREQ may be
deasserted for one or more clocks, due to cache
and TLB logic.

3.1.5 INTERRUPT/CODE-SIZE (INT/CS8) 

This input allows interruption of the current instruction
stream. If interrupts are enabled (IM set in psr)
when INT is asserted, the i860 XR microprocessor
fetches the next instruction from address
0xFFFFFF00. To ensure that an interrupt is recognized,
INT should remain asserted until the software
acknowledges the interrupt (by writing, for example,
to a memory-mapped port of an interrupt controller).
When the bus is not locked, the maximum time between
the assertion of INT and the execution of the
first instruction of the trap handler is ten clocks, plus
the time for four sets of four pipelined read cycles
and two sets of four pipelined writes (instruction-
and data-cache misses and write-back cycles to update
memory), plus the time for twenty nonpipelined
read cycles (six TLB misses, with eight refetches
when the A-bit is zero), plus the time for eight
nonpipelined writes (updates to the A-bit).
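
The worst-case latency above is a sum of fixed terms, so it can be
evaluated once the memory system's cycle times are known. The sketch
below only restates that sum; the function and parameter names are
hypothetical, and the cycle times in the example are placeholders, not
figures from this data sheet.

```python
def max_interrupt_latency(read_set, write_set, read_cycle, write_cycle):
    """Worst-case clocks from INT assertion to the first handler instruction.

    read_set     clocks for one set of four pipelined read cycles
    write_set    clocks for one set of four pipelined write cycles
    read_cycle   clocks for one nonpipelined read cycle
    write_cycle  clocks for one nonpipelined write cycle
    """
    return (10                  # fixed overhead
            + 4 * read_set      # instruction- and data-cache misses
            + 2 * write_set     # write-back cycles to update memory
            + 20 * read_cycle   # TLB misses and refetches
            + 8 * write_cycle)  # updates to the A-bit


if __name__ == "__main__":
    # Placeholder timings chosen only to exercise the formula.
    print(max_interrupt_latency(read_set=5, write_set=5, read_cycle=2, write_cycle=2))
```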

If the bus is locked from a lock instruction, the INT 
pin is ignored and the INT bit of epsr is always zero. 
The lock instruction can only assert LOCK# for 30- 
33 instructions before trapping. 

If INT is asserted during the clock before the falling 
edge of RESET, the eight-bit code-size mode is se- 
lected. For more about this mode, refer to section 
3.2 "Initialization". 

INT is an asynchronous input. 



3.1.6 ADDRESS PINS (A31-A3) AND BYTE
ENABLES (BE7#-BE0#)

The 29-bit address bus (A31-A3) identifies addresses
to a 64-bit location. Separate byte-enable signals
(BE7#-BE0#) identify which bytes should be accessed
within the 64-bit location. In all noncacheable
read cycles (KEN# deasserted), the byte
enables match the length and address of the requested
data. Cacheable read cycles (KEN# asserted),
however, result in four 64-bit memory cycles to
fill an entire 32-byte cache line. The BEn# pins activated
are those that represent the operand of the
load instruction that caused the line fill, and these
same BEn# pins remain activated for all four cycles
of the line fill. All 64 bits must be returned for each
cycle without regard for the BEn# signals. In all
write cycles (noncacheable writes as well as cache
line write-backs) the BEn# signals indicate the
bytes that must be written.

Instruction fetches (W/R# is low) are distinguished
from data accesses by the unique combinations of
BE7#-BE0# defined in Table 3.2. For an eight-bit
code fetch in eight-bit code-size (CS8) mode,
BE2#-BE0# are redefined to be A2-A0 of the address.
In this case BE7#-BE3# form the code
shown in Table 3.2 that identifies an instruction
fetch. The A2 in the table does not represent a physical
pin, just a conceptual internal address line value.
The "x" under A2 for CS8 mode means "not applicable",
or "don't care". All other combinations of byte
enables indicate data accesses.

The address and byte-enable pins are driven until
either NA# or READY# is asserted.

3.1.7 DATA PINS (D63-D0)

The bus interface has 64 bidirectional data pins
(D63-D0) to transfer data in eight- to 64-bit quantities.
Pins D7-D0 transfer the least significant byte;
pins D63-D56 transfer the most significant byte.

In read bus cycles, all 64 bits of the data bus are
latched, even in CS8-mode instruction fetches when
only the low-order eight bits are used.

In write bus cycles, the point at which data is driven
onto the bus depends on the type of the preceding
cycle. If there was no preceding cycle (i.e., the bus
was idle), data is driven with the address. If the preceding
cycle was a write, data is driven as soon as
READY# is returned from the previous cycle. If the
preceding cycle was a read, data is driven one clock
after READY# is returned from the previous cycle,
thereby allowing time for the bus to be turned
around. Data continues to be driven until READY#
for the current cycle is returned.
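
The rule in section 3.1.6 that byte enables match the length and
address of the requested data can be sketched for a noncacheable data
access. This is an illustration only: the function name is
hypothetical, and it is assumed that BEn# selects the byte at offset n
within the 64-bit location, with byte 0 carried on D7-D0.

```python
def bus_address_and_byte_enables(address, length):
    """Return (A31-A3 value, BE7#-BE0# pin levels) for an aligned data access.

    In the returned byte-enable value, bit n is the level of BEn#;
    a 0 bit means the (active-low) enable is asserted.
    """
    offset = address & 0x7
    if length not in (1, 2, 4, 8) or offset % length != 0:
        raise ValueError("operand must be 1, 2, 4, or 8 bytes and aligned")
    selected = ((1 << length) - 1) << offset  # one bit per byte accessed
    pins = ~selected & 0xFF                   # invert: the enables are active low
    return address >> 3, pins


if __name__ == "__main__":
    a, be = bus_address_and_byte_enables(0x1004, 4)
    print(hex(a), format(be, "08b"))
```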



3.1.8 BUS LOCK (LOCK#) 

This signal is used to provide atomic (indivisible) 
read-modify-write sequences in multiprocessor sys- 
tems. A multiprocessor bus arbiter must permit only 
one processor a locked access to the address which 
is on the bus when LOCK# first activates. The sys- 
tem must maintain the lock of that location until 
LOCK# deactivates. 

The i860 XR microprocessor coordinates the exter- 
nal LOCK# signal with the software-controlled BL 
bit of the dirbase register. Programmers do not 
have to be concerned about the fact that bus activity 
is not always synchronous with instruction execu- 
tion. LOCK# is asserted with ADS# for the address 
operand of the first load or store instruction execut- 
ed after the BL bit is set by the lock instruction. 
Pending bus cycles are locked according to the val- 
ue of the BL bit when the instruction was executed. 
Even if the BL bit is changed between the time that 
an instruction generates an internal bus request and 
the time that the cycle appears on the bus, the i860 
XR microprocessor still asserts LOCK# for that bus 
cycle. 

If ADS# is active when LOCK# deactivates, then 
that request should complete before the hardware 
relinquishes the lock. If ADS# is not active, the lock- 
ing of the location can immediately end when 
LOCK# deactivates. Of course the simplest arbitration
hardware can just lock the entire bus against all
other accesses during LOCK# assertion through
READY# of the cycle in which LOCK# goes inactive.
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Table 3.2. Identifying Instruction Fetches 


Code Fetch          A2   BE7#  BE6#  BE5#  BE4#  BE3#  BE2#  BE1#  BE0#
------------------  ---  ----  ----  ----  ----  ----  ----  ----  ----
Normal (Non-CS8)    0    1     1     1     1     1     0     1     0
Normal (Non-CS8)    1    1     0     1     0     1     1     1     1
CS8 Mode            X    1     0     1     0     1     Low-order address bits



When the BL bit is deasserted with the unlock in- 
struction, LOCK# is deasserted with the next load 
or store but after any pending bus cycles. Between 
locked sequences, at least one cycle of no LOCK# 
is guaranteed by the behavior of the unlock instruc- 
tion. LOCK# deassertion may occur independently 
of ADS# for the case of a trap or a cache hit after 
unlock. 

The i860 XR microprocessor also asserts LOCK#
during TLB miss processing for updates of the accessed
bit in page-table entries. The maximum time
that LOCK# can be asserted in this case is five
clocks plus the time required to perform a read-modify-write
sequence. Instruction fetches do not alter
the LOCK# pin.

Between lock and unlock instructions, the INT pin is
ignored and the INT bit of epsr is zero when read by
ld.c epsr. The time that interrupts are disabled is
limited by the lock protocol outlined in Section 2.8.2.

3.1.9 WRITE/READ BUS CYCLE (W/R#) 

This pin specifies whether a bus cycle is a read
(LOW) or write (HIGH) cycle. It is driven until either
NA# or READY# is asserted.

3.1.10 NEXT NEAR (NENE#) 

This signal allows higher-speed reads and writes in
the case of consecutive reads and writes that access
static column or page-mode DRAMs. The i860 XR
microprocessor asserts NENE# when the current
address is in the same DRAM page as the previous
bus cycle. The i860 XR microprocessor determines
the DRAM page size by inspecting the DPS field in
the dirbase register. The page size can range from
2^9 to 2^16 64-bit words, supporting DRAM sizes
from 256K x 1, 256K x 4, and up. NENE# is never
asserted on the next bus cycle after HLDA is
deasserted.
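
The page comparison described above can be sketched in a few lines. This is only an illustrative model, not the device's logic: the function name and the `log2_words` parameter are hypothetical, the encoding of the DPS field is deliberately not modelled, and the HLDA exception is ignored.

```python
def same_dram_page(prev_addr: int, cur_addr: int, log2_words: int) -> bool:
    """Return True if two byte addresses fall in the same DRAM page.

    log2_words is the page-size exponent: a page holds 2**log2_words
    64-bit words (9 through 16, per the range given in the text).
    """
    if not 9 <= log2_words <= 16:
        raise ValueError("page size must be 2**9 to 2**16 64-bit words")
    page_shift = log2_words + 3  # 8 bytes per 64-bit word
    return (prev_addr >> page_shift) == (cur_addr >> page_shift)


# With the smallest page (2**9 words = 4 Kbytes):
print(same_dram_page(0x0000_0000, 0x0000_0FF8, 9))  # both in the first page
print(same_dram_page(0x0000_0FF8, 0x0000_1000, 9))  # crosses a page boundary
```

A memory controller could use the same comparison to decide when a row-address strobe can be skipped; NENE# simply exports the result so that external hardware does not have to compute it.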

3.1.11 NEXT ADDRESS REQUEST (NA#) 

NA# makes address pipelining possible. The system
asserts NA# for at least one clock to indicate
that it is ready to accept the next address from the
i860 XR microprocessor. NA# may be asserted before
the current cycle ends. (If the system does not
implement pipelining, NA# does not have to be
activated.) The i860 XR microprocessor samples NA#
every clock, starting one clock after the prior
activation of ADS#. When NA# is active, the i860 XR
microprocessor is free to drive address and
bus-cycle definition for the next pending bus cycle.
The i860 XR microprocessor remembers that NA# was
asserted when no internal request is pending;
therefore, NA# can be deactivated after the next
rising edge of the CLK signal. Up to three bus cycles
can be outstanding simultaneously.

3.1.12 TRANSFER ACKNOWLEDGE (READY#) 

The system must assert the READY# signal during
read cycles when valid data is on the data pins and
during write cycles when the system has accepted
data from the data pins. READY# must be asserted
for at least one clock. Sampling of READY# begins
in the clock after an ADS# or in the second clock
after a prior READY#.

3.1.13 ADDRESS STATUS (ADS#) 

The i860 XR microprocessor asserts ADS# during 
the first clock of each bus cycle to identify the clock 
period during which it begins to assert outputs on 
the address bus. This signal is held active for one 
clock. 

3.1.14 CACHE ENABLE (KEN#) 

The i860 XR microprocessor samples KEN# to
determine whether the data being read for the current
cache-miss cycle is to be cached. This pin is
internally NORed with the CD and WT bits to control
cacheability on a page-by-page basis (refer to Table
3.3).

If the address is one that is permitted to be in the
cache, KEN# must be continuously asserted during
the sampling period starting from the second rising
clock edge after ADS# is asserted, through the
clock in which NA# or READY# is asserted. The
entire 64 bits of the data bus will be used for the
read, regardless of the state of the byte-enable pins.
Three additional 64-bit bus cycles will be generated
to fill the rest of the 32-byte cache block.
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If KEN# is found deasserted at any clock from the 
clock after ADS# through the clock of the first NA# 
or READY #, the data being read will not be cached 
and two scenarios can occur: 1) if the cycle is due to 
data-cache miss, no subsequent cache-fill cycles 
will be generated; 2) if the cycle is due to an instruc- 
tion-cache miss, additional cycle(s) will be generat- 
ed until the address reaches a 32-byte boundary. To 
avoid caching a line, external hardware must deas- 
sert KEN# during or before the first NA# or 
READY #. 
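
As a rough illustration of the cacheability rule, the sketch below combines the KEN# level with the CD and WT page-table bits and counts the read cycles that follow a data-cache miss. The function names are hypothetical, and the model assumes KEN# is held at one level for the whole sampling window; the instruction-cache case (cycles continuing to a 32-byte boundary) is not modelled.

```python
def line_is_cached(cd: int, wt: int, ken_n: int) -> bool:
    """Cacheability decision for a read miss.

    cd, wt: page-table-entry bits (1 = set).
    ken_n:  level on the KEN# pin (0 = asserted; the pin is active low).
    """
    return (cd | wt) == 0 and ken_n == 0


def read_cycles_for_data_miss(cd: int, wt: int, ken_n: int) -> int:
    """Bus read cycles generated for a data-cache miss.

    A cacheable miss fills a 32-byte block with four 64-bit cycles;
    a noncacheable miss performs only the original read.
    """
    return 4 if line_is_cached(cd, wt, ken_n) else 1


print(read_cycles_for_data_miss(cd=0, wt=0, ken_n=0))  # cacheable access
print(read_cycles_for_data_miss(cd=0, wt=0, ken_n=1))  # KEN# deasserted
```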



3.1.15 PAGE TABLE BIT (PTB) 

Depending on the setting of the PBM (page-table bit 
mode) bit of the epsr, the PTB reflects the value of 
either the CD (cache disable) bit or the WT (write 
through) bit of the page-table entry used for the cur- 
rent cycle. When paging is disabled, PTB remains 
inactive. 

Table 3.3. Cacheability Based on KEN# and CD OR WT

CD OR WT   KEN#   Meaning

0          0      Cacheable access
0          1      Noncacheable access
1          0      Noncacheable page
1          1      Noncacheable page



3.1.16 BOUNDARY SCAN SHIFT INPUT (SHI) 

This pin is used with the testability features. Refer to 
section 3.3. 



3.1.17 BOUNDARY SCAN ENABLE (BSCN) 

This pin is used with the testability features. Refer to 
section 3.3. 



3.1.18 SHIFT SCAN PATH (SCAN) 

This pin is used with the testability features. Refer to 
section 3.3. 



3.1.19 CONFIGURATION (CC1-CC0) 

These two pins are reserved by Intel. Strap both pins 
LOW. 



3.1.20 SYSTEM POWER (VCC) AND GROUND (VSS)

The i860 XR microprocessor has 48 pins for power 
and ground. All pins must be connected to the ap- 
propriate low-inductance power and ground signals 
in the system. 



3.2 Initialization 

Initialization of the i860 XR microprocessor is 
caused by assertion of the RESET signal for at least 
16 clocks. Table 3.4 shows the status of output pins 
during the time that RESET is asserted. Note that 
HOLD requests are honored during RESET and that 
the status of output pins depends on whether a 
HOLD request is being acknowledged. 
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Table 3.4. Output Pin Status during Reset 



                                      Pin Value
Pin Name                    HOLD Not          HOLD
                            Acknowledged      Acknowledged

ADS#, LOCK#                 HIGH              Tri-State OFF
W/R#, PTB                   LOW               Tri-State OFF
BREQ                        LOW               LOW
HLDA                        LOW               HIGH
D63-D0                      Tri-State OFF     Tri-State OFF
A31-A3, BE7#-BE0#, NENE#    Undefined         Tri-State OFF



After a reset, the i860 XR microprocessor begins
executing at physical address 0xFFFFFF00. The
program-visible state of the i860 XR microprocessor
after reset is detailed in section 2.8.7.

Eight-bit code-size mode is selected when INT/CS8 
is asserted during the clock before the falling edge 
of RESET. While in eight-bit code-size mode, in- 
struction cache misses are byte reads (transferred 
on D7-D0 of the data bus) instead of eight-byte 
reads. This allows the i860 XR microprocessor to be 
bootstrapped from an eight-bit EPROM. For these 
code reads, byte enables BE2#-BE0# are rede- 
fined to be the low order three bits of the address, 
so that a complete byte address is available. These 
reads update the instruction cache if KEN# is as- 
serted (refer to section 3.1.14) and are not pipelined 
even if NA# is asserted. While in this mode, instruc- 
tions must reside in an eight-bit wide memory, while 
data must reside in a separate 64-bit wide memory. 
After the code has been loaded into 64-bit memory, 
initialization code can initiate 64-bit code fetches by 
clearing the CS8 bit of the dirbase register (refer to 
section 2). Once eight-bit code-size mode is dis- 
abled by software, it cannot be reenabled except by 
resetting the i860 XR microprocessor. 
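
The byte address seen by an eight-bit boot EPROM can be sketched as follows. This is an illustrative model only: the function name is hypothetical, and it assumes that the levels on BE2#-BE0# carry the three low-order address bits directly (without inversion), which the text above does not state explicitly.

```python
def cs8_byte_address(a31_a3: int, be2: int, be1: int, be0: int) -> int:
    """Form a full byte address for a CS8-mode code read.

    a31_a3: value driven on A31-A3 (the address divided by 8).
    be2, be1, be0: levels on BE2#-BE0#, assumed here to be address
    bits 2, 1, and 0 as-is (an assumption of this sketch).
    """
    return (a31_a3 << 3) | (be2 << 2) | (be1 << 1) | be0


# First byte fetched after reset, at physical address 0xFFFFFF00:
reset_vector = cs8_byte_address(0xFFFFFF00 >> 3, 0, 0, 0)
print(hex(reset_vector))
```

In a board design this corresponds to wiring A31-A3 and BE2#-BE0# together onto the EPROM address inputs, with D7-D0 as the only data lines used.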



3.3 Testability 

The i860 XR microprocessor has a boundary scan 
mode that may be used in component- or board-lev- 
el testing to test the signal traces leading to and 
from the i860 XR microprocessor. Boundary scan 
mode provides a simple serial interface that makes it 
possible to test all signal traces with only a few 
probes. Probes need be connected only to CLK, 
BSCN, SCAN, SHI, BREQ, RESET, and HOLD. 

The pins BSCN and SCAN control the boundary
scan mode (refer to Table 3.5). When BSCN is
asserted, the i860 XR microprocessor enters boundary
scan mode on the next rising clock edge. Boundary
scan mode can be activated even while RESET is
active. When BSCN is deasserted while in boundary
scan mode, the i860 XR microprocessor leaves
boundary scan mode on the next rising clock edge.
After leaving boundary scan mode, the internal state
is undefined; therefore, RESET should be asserted.





Table 3.5. Test Mode Selection 


BSCN   SCAN   Testability Mode

LO     LO     No testability mode selected
LO     HI     (Reserved for Intel)
HI     LO     Boundary scan mode, normal
HI     HI     Boundary scan mode, shift
              (SHI as input; BREQ as output)



For testing purposes, each signal pin has an
associated internal latch. Table 3.6 identifies these
latches by name and classifies them as input,
output, or control. The input and output latches
carry the name of the corresponding pins.

Table 3.6. Test Mode Latches 



Input        Output       Associated
Latch        Latch        Control Latch

SHI
BSCN
SCAN
RESET
D0-D63       D0-D63       DATAt
CC1-CC0
             A31-A3       ADDRt
             NENE#        NENEt
             PTB          PTBt
             W/R#         W/Rt
             ADS#         ADSt
             HLDA
             LOCK#        LOCKt
READY#
KEN#
NA#
INT/CS8
HOLD
             BE7#-BE0#    BEt
             BREQ





Within boundary scan mode the i860 XR
microprocessor operates in one of two submodes:
normal mode or shift mode, depending on the value of
the SCAN input. A typical test sequence is:

1. Enter shift mode to assign values to the latches
that correspond with the pins.

2. Enter normal mode. In normal mode the i860 XR
microprocessor transfers the latched values to
the output pins and latches the values that are
being driven onto the input pins.

3. Reenter shift mode to read the new values of the
input pins.
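
The shift-in and shift-out steps of this sequence can be pictured with a toy shift register. This is a simplified sketch, not a model of the actual chain: the `ScanChain` class, its four-latch length, and the choice of returning the bit present on BREQ before each clock edge are all assumptions made for illustration.

```python
class ScanChain:
    """Toy model of a boundary scan chain used in shift mode."""

    def __init__(self, length: int):
        # Index 0 is the end of the chain fed by SHI;
        # the last element is the end observed on BREQ.
        self.latches = [0] * length

    def shift(self, shi: int) -> int:
        """Apply one rising CLK edge; return the bit that was on BREQ."""
        out = self.latches[-1]
        self.latches = [shi] + self.latches[:-1]
        return out


chain = ScanChain(4)

# Step 1: shift a pattern in through SHI.
for bit in [1, 0, 1, 1]:
    chain.shift(bit)
print(chain.latches)

# Step 3: shift the (here unchanged) latch values back out on BREQ.
print([chain.shift(0) for _ in range(4)])
```

In a real test the second step, normal mode, would overwrite the input latches with the values on the pins before they are shifted out; the sketch omits that step, so the bits come back in the order they were shifted in.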

3.3.1 NORMAL MODE 

When SCAN is deasserted, the normal mode is
selected. For each input pin (RESET, HOLD,
INT/CS8, NA#, READY#, KEN#, SHI, BSCN,
SCAN, CC1, and CC0), the corresponding latch is
loaded with the value that is being driven onto the
pin.

The tristate output pins (A31-A3, BE7#-BE0#, 
W/R#, NENE#, ADS#, LOCK#, and PTB) are en- 
abled by the control latches ADDRt (for A31-A3), 
BEt, W/Rt, NENEt, ADSt, LOCKt, and PTBt. If a con- 
trol latch is set, the corresponding output latches 
drive their output pins; otherwise the pins are not 
driven. 

The I/O pins (D63-D0) are enabled by the control 
latch DATAt, which is similar to the other control 
latches. In addition, when DATAt is not set, the data 
pins are treated as input pins and their values are 
latched. 



3.3.2 SHIFT MODE

When SCAN is asserted, the shift mode is selected.
In shift mode, the pins are organized into a boundary
scan chain. The scan chain is configured as a shift
register that is shifted on the rising edge of CLK. The
SHI pin is connected to the input of one end of the
boundary scan chain. The value of the most
significant bit of the scan chain is output on the BREQ
pin. To avoid glitches while the values are being
shifted along the chain, the tester should assert both
the RESET and HOLD pins. Then all tristate outputs
are disabled. The order of the pins within the chain is
shown in Figure 3.1.

A tester causes entry into this mode for one of two
purposes:

1. To assign values to output latches to be driven
onto output pins upon subsequent entry into
normal mode.

2. To read the values of input pins previously latched
in normal mode.



4.0 BUS OPERATION

A bus cycle begins when ADS# is activated and
ends when READY# is sampled active. READY# is
sampled one clock after assertion of ADS# and
thereafter until it becomes active. New cycles can
start as often as every other clock until three cycles
are outstanding. A bus cycle is considered
outstanding as long as READY# has not been asserted
to terminate that cycle. After READY# becomes
active, it is not sampled again for the following
(outstanding) cycle until the second clock after the
one during which it became active. READY# is
assumed to be inactive when it is not sampled.

With regard to how a bus cycle is generated by the
i860 XR microprocessor, there are two types of
cycles: pipelined and nonpipelined. Both types of
cycles can be either read or write cycles. A pipelined
cycle is one that starts while one or two other bus
cycles are outstanding. A nonpipelined cycle is one
that starts when no other bus cycles are
outstanding.



4.1 Pipelining 

An m-n read or write cycle is a cycle with a total
cycle time of m clocks and a cycle-to-cycle time of n
clocks (m >= n). Total cycle time extends from the
clock in which ADS# is activated to the clock in
which READY# becomes active, whereas
cycle-to-cycle time extends from the time that READY#
is sampled active for the previous cycle to the time
that it is sampled active again for the current cycle.
When m = n, a nonpipelined cycle is implied; m > n
implies a pipelined cycle.



[Diagram not reproduced. Legible chain positions:
1 SHI, 3 SCAN, 4 RESET, 6 D0, 69 D63.]

Figure 3.1. Order of Boundary Scan Chain

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Pipelining may occur for the next bus cycle any time 
the current bus cycle requires more than two clock 
periods to finish (m > 2). If a bus request is pending, 
the next cycle will be initiated when NA# is sampled 
active, even if the current cycle has not terminated. 
In this case, pipelining occurs. NA# is not
recognized until after ADS# has become inactive.

To allow high transfer rates in large memory sys- 
tems, two-level pipelining is supported (i.e., there 
may be up to three cycles in progress at one time). 
Pipelining enables a new word of data to be trans- 
ferred every two clocks, even though the total cycle 
time may be up to six clocks. 
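
The m-n notation and the resulting transfer rate can be made concrete with a short calculation. The helper names below are hypothetical, and the 40 MHz clock in the example is only an illustrative figure, not a value taken from this section.

```python
def classify_cycle(m: int, n: int) -> str:
    """Classify an m-n bus cycle as pipelined or nonpipelined.

    m: total cycle time in clocks (ADS# to READY#).
    n: cycle-to-cycle time in clocks (READY# to READY#).
    """
    if n < 2 or m < n:
        raise ValueError("expected m >= n >= 2")
    return "nonpipelined" if m == n else "pipelined"


def peak_bytes_per_second(n: int, clk_hz: float) -> float:
    """Peak transfer rate when one 64-bit word moves every n clocks."""
    return 8 * clk_hz / n


print(classify_cycle(2, 2))                  # fastest nonpipelined cycle
print(classify_cycle(5, 2))                  # a 5-2 pipelined cycle
print(peak_bytes_per_second(2, 40_000_000))  # assumed 40 MHz CLK
```

The point of pipelining shows up in the second function: the rate depends only on n, so a memory system with a long total access time m can still deliver a word every two clocks.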



4.2 Bus State Machine 

The operation of the bus is described in terms of a 
bus state machine using a state transition diagram. 
Figure 4.1 illustrates the i860 XR microprocessor 
bus state machine. A bus cycle is composed of two 
or more states. Each bus state lasts for one CLK 
period. 

The i860 XR microprocessor supports up to two lev- 
els of address pipelining. Once it has started the first 
bus cycle, it can generate up to two more cycles as 
long as READY # remains inactive. To start a new 
bus cycle while other cycles are still outstanding, 
NA# must be active for at least one clock cycle 
starting with the clock after the previous ADS#. 
NA# is latched internally. 

States Tj and Tjk, for j = {1,2,3} and k = {1,2}, are
used to describe the state of the i860 XR
microprocessor Bus State Machine. Index j indicates
the number of outstanding bus cycles while index k
distinguishes the intermediate states for the j-th
outstanding cycle. Therefore there can be up to three
outstanding cycles, and there are two possible
intermediate states for each level of pipelining. Tj1 is
the next state after Tj, as long as j cycles are
outstanding. Tj2 is entered when NA# is active but the
i860 XR microprocessor is not ready to start a new
cycle.

Five conditions have to be met to start a new cycle 
while one or more cycles are already pending: 

1. READY # inactive 

2. NA# having been active 

3. An internal request pending (BREQ active) 

4. HOLD not active 

5. Fewer than three cycles outstanding 

Note that BREQ is asserted on the clock after the 
i860 XR microprocessor realizes an internal request 
for the bus. 
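
The five conditions listed above translate directly into a single predicate. The function and parameter names are hypothetical; the sketch takes already-decoded boolean conditions rather than raw pin levels.

```python
def can_start_pipelined_cycle(
    ready_active: bool,
    na_latched: bool,
    request_pending: bool,
    hold_active: bool,
    outstanding: int,
) -> bool:
    """True when a new cycle may start while others are still pending."""
    return (
        not ready_active        # 1. READY# inactive
        and na_latched          # 2. NA# having been active
        and request_pending     # 3. internal request pending (BREQ active)
        and not hold_active     # 4. HOLD not active
        and outstanding < 3     # 5. fewer than three cycles outstanding
    )


print(can_start_pipelined_cycle(False, True, True, False, 2))
print(can_start_pipelined_cycle(False, True, True, False, 3))
```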

Upon hardware RESET, the bus control logic enters
the idle state TI and awaits an internal request for a
bus cycle. If a bus cycle is requested while there is
no hold request from the system, a bus cycle begins,
advancing to state T1. On the next clock, the state
machine automatically advances to state T11. If
READY# is active in state T11, the bus control logic
returns either to TI, if no new cycle is started, or to
T1, if a new cycle request is pending internally. In
fact, if an internal bus request is pending each time
READY# is active, the state machine continues to
cycle between T11 and T1.

However, if READY# is not active but the next
address request is pending (as indicated by an active
NA#), the state machine advances either to state
T2 (if an internal bus request is pending, signifying
that two bus cycles are now outstanding), or to state
T12 (if no internal bus request is pending, signifying
that NA# has been found active). Transitions from
state T12 are similar to those from T11.

[State transition diagram not reproduced. Legible
transition labels: READY# DEASSERTED; READY#
DEASSERTED, NA# DEASSERTED; READY# DEASSERTED
(NO REQUEST, HOLD ASSERTED); HOLD ASSERTED.]

Notes from the figure:

Once READY# has been sampled active, it is not
sampled again until two clocks later.
Not sampled during ADS# active clock.
Active in T1, T2, and T3.
Active in TH.
HOLD in this figure is the internally synchronized
version of the external signal HOLD.
Internal Bus Request Pending (BREQ asserted).

Figure 4.1. Bus State Machine



If two bus cycles are already outstanding (as
indicated by T2k for k = {1,2}) and NA# is latched
active but READY# is not active, one more bus
request causes entry into state T3. Transitions from
this state are similar to those from T2.

In general, if there is an internal bus request each
time both READY# and NA# are active, the state
machine continues to oscillate between Tj1 and Tj,
for j = {2,3}.

When NA# is sampled active while there is a
pending bus request, ADS# is activated in the next
clock period (provided no more than two cycles are
already outstanding).
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Internal pending bus requests start new bus cycles 
only if no HOLD request has been recognized. TH is
entered from the idle state TI, T11, and T12. HLDA is
active in this state. There is a one clock delay to
synchronize the HOLD input when the signal meets
the respective minimum setup and hold time
requirements. The state machine uses the
synchronized HOLD to move from state to state.
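
The first level of the state machine can be sketched as a transition function. This is a partial, hedged model built only from the prose of this section, writing the idle state as TI and the hold state as TH: it covers TI, T1, T11, T12, and TH, and treats T2 as an exit from the modelled region. The function name, the priority given to HOLD over a pending request, and the TH-to-TI return are assumptions of the sketch, not statements from the state diagram.

```python
def next_state(state: str, *, request: bool, hold: bool,
               ready: bool, na: bool) -> str:
    """Next bus state for the single-outstanding-cycle part of the machine.

    request: internal bus request pending
    hold:    internally synchronized HOLD
    ready:   READY# sampled active
    na:      NA# sampled active in this clock
    """
    if state == "TI":
        if hold:
            return "TH"
        return "T1" if request else "TI"
    if state == "TH":
        # Assumption: the machine passes back through idle after HOLD ends.
        return "TH" if hold else "TI"
    if state == "T1":
        return "T11"  # automatic advance on the next clock
    if state in ("T11", "T12"):
        if ready:
            if hold:
                return "TH"
            return "T1" if request else "TI"
        # In T12 the earlier NA# activation is remembered.
        na_seen = na or state == "T12"
        if na_seen:
            return "T2" if (request and not hold) else "T12"
        return "T11"
    raise ValueError(f"state {state!r} is outside this partial model")


# Back-to-back nonpipelined cycles: T1 -> T11 -> T1 -> ...
s = next_state("TI", request=True, hold=False, ready=False, na=False)
s = next_state(s, request=True, hold=False, ready=False, na=False)
print(s)
print(next_state(s, request=True, hold=False, ready=True, na=False))
```

Extending the model to T2, T21, T22, T3 and beyond would follow the same pattern, but the text only says those transitions are "similar", so they are left out rather than guessed.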



4.3 Bus Cycles 

Figures 4.2 through 4.10 illustrate combinations of 
bus cycles. 



4.3.1 NONPIPELINED READ CYCLES 

A read cycle begins with the clock in which ADS# is
asserted. The i860 XR microprocessor begins
driving the address during this clock. It samples
READY# for active state every clock after the first
clock. A minimum of two clocks is required per cycle.
Data is latched when READY# is found active when
sampled at the end of a clock period. Figure 4.2
illustrates nonpipelined read cycles with zero wait
states.
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Figure 4.2. Fastest Read Cycles 
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Figure 4.3. Fastest Write Cycles 



4.3.2 NONPIPELINED WRITE CYCLES 

The ADS# and READY # activity for write cycles 
follows the same logic as that for read cycles, as 
Figure 4.3 illustrates for back-to-back, nonpipelined 
write cycles with zero wait-states. 

The fastest write cycle takes only two clocks to
complete. However, when a read cycle immediately
precedes a write cycle, the write cycle must contain a
wait state, as illustrated in Figure 4.4. Because the
device being read might still be driving the data bus
during the first clock of the write cycle, there is a
potential for bus contention. To help avoid such
contention, the i860 XR microprocessor does not
drive the data bus until the second clock of the write
cycle. The wait state is required to provide the
additional time necessary to terminate the write
cycle. In other read-write combinations, the i860 XR
microprocessor does not require a wait state.
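
The wait-state rule for nonpipelined cycles reduces to a small lookup. The function name is hypothetical, and the sketch covers only the minimum cycle length imposed by the processor; a slow memory system can of course add further wait states.

```python
from typing import Optional


def min_nonpipelined_clocks(kind: str, previous: Optional[str] = None) -> int:
    """Minimum clocks for a nonpipelined cycle of the given kind.

    kind, previous: "read" or "write"; previous is the cycle that
    immediately precedes this one, or None if the bus was idle.
    """
    if kind not in ("read", "write"):
        raise ValueError("kind must be 'read' or 'write'")
    if kind == "write" and previous == "read":
        return 3  # two clocks plus one wait state to avoid contention
    return 2


print(min_nonpipelined_clocks("write", previous="read"))
print(min_nonpipelined_clocks("read", previous="write"))
```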
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Figure 4.4. Fastest Read/ Write Cycles 
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Figure 4.5. Pipelined Read Followed by Pipelined Write 
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Figure 4.6. Pipelined Write Followed by Pipelined Read 



4.3.3 PIPELINED READ AND WRITE CYCLES 

Figures 4.5 and 4.6 illustrate combinations of
nonpipelined and pipelined read and write cycles. The
following description applies to both diagrams. While
Cycle 1 is still in progress, two new cycles are
initiated. By the time READY# first becomes active,
the state machine has moved through states T1, T11,
T2, T21, and T3. Cycles 3 and 4 show how activating
READY# terminates the corresponding outstanding
cycle, and yet activating NA# while there is an
internal request pending adds a new outstanding cycle.

In Figure 4.5, Cycle 3 is a write cycle following a read
cycle; therefore, one wait state must be inserted.
The i860 XR microprocessor does not drive the data
bus until one clock after the read data is returned
from the preceding read cycle. During Cycles 3 and
4, the state machine oscillates between states T3
and T31, maintaining full bus capacity (two levels of
pipelining; three outstanding cycles). Cycles 2, 3,
and 4 in Figure 4.6 are 5-2 cycles; i.e., each requires
a total cycle time of five clocks while the throughput
rate is one cycle every two clocks.

Figure 4.7 illustrates in a more general manner how 
the NA# signal controls pipelining. Cycle 1 is a 2-2 
cycle, the fastest possible. The next cycle cannot be 
started any earlier; therefore, there is no need to 
activate NA# to start the next cycle early. Cycle 2, a 
3-3 read, is different. Cycle 3 can be started during 
the third state (a wait state) of Cycle 2, and NA# is 
asserted to accomplish this. 

NA# is not activated following the ADS# clock of 
Cycle 3, thereby allowing Cycle 3 to terminate be- 
fore the start of Cycle 4. As a result, Cycle 4 is a 
nonpipelined cycle. 



[Timing diagram not reproduced. Signals shown: CLK;
ADS#; A31-A3, W/R#, BEn#, NENE#, PTB; NA#;
READY#; D63-D0. Cycle 1: nonpipelined read (2-2).
Cycle 2: nonpipelined read (3-3). Cycle 3: pipelined
read (3-2). Cycle 4: nonpipelined read (2-2),
followed by idle states.]

Figure 4.7. Pipelining Driven by NA#
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Figure 4.9. Locked Cycles 



When there is no internal bus request, activating 
NA# does not start a new cycle; the i860 XR micro- 
processor, however, remembers that NA# has been 
activated. Figure 4.8 illustrates the situation where 
NA# is active but no internal bus request is pending. 
NA# is activated when two cycles are outstanding. 
Because there is no internal request pending until 
after one idle state, no new bus cycle is started dur- 
ing that period. 

4.3.4 LOCKED CYCLES 

The LOCK# signal is asserted when the current bus 
cycle is to be locked with the next bus cycle. Asser- 
tion of LOCK# may be initiated by a program's set- 
ting the BL bit of the dirbase register using the lock 
instruction (refer to section 2) or by the i860 XR mi- 
croprocessor itself during page table updates. 

In Figure 4.9, the first read cycle is to be locked with 
the following write cycle. If there were idle states 
between the cycles, the LOCK# signal would re- 
main asserted. This is the case for a read/modify/ 
write operation. Cycle 3 is not locked because 
LOCK# is no longer asserted when Cycle 2 starts. 



4.3.5 HOLD AND BREQ ARBITRATION CYCLES 

The HOLD, HLDA, and BREQ signals permit bus ar- 
bitration between the i860 XR microprocessor and 
another bus master. 

See Figure 4.10. When HOLD is asserted, the i860 
XR microprocessor does not relinquish control of 
the bus until all outstanding cycles are completed. If 
HOLD were asserted one clock earlier, the last i860 
XR microprocessor bus cycle before HLDA would 
not be started. 

HOLD is sampled at the end of the clock in which it 
is activated. Recommended setup and hold times 
must be met to guarantee sampling one clock after 
external HOLD activation. When HOLD is sampled 
active, a one clock delay for internal synchronization 
follows. Likewise when HOLD is deasserted, there is 
a one-clock delay for internal synchronization before 
HLDA is deasserted. The outputs (except HLDA and 
BREQ) float when HLDA is asserted. 
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Figure 4.10. HOLD, HLDA, and BREQ 



If, during a HOLD cycle, an internal bus request is
generated, BREQ is activated even though HLDA is
asserted. It remains active at least until the clock
after ADS# is activated for the requested cycle.



4.4 Bus States During RESET 

Figure 4.11 shows how INT/CS8 is sampled during
the clock period just before the falling edge of
RESET. If INT/CS8 is sampled active, the i860 XR
microprocessor enters CS8 mode. No inputs (except
for HOLD and INT/CS8) are sampled during RESET.

Note that, because HOLD is recognized even while 
RESET is active, the HLDA output signal may also 
become active during RESET. Refer to Table 3.4 
"Output Pin Status during Reset". 
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Figure 4.1 1. Reset Activities 
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5.0 MECHANICAL DATA 

Figures 5.1 and 5.2 show the locations of pins; Tables 5.1 and 5.2 help to locate pin identifiers. 



[Pin grid diagram not reproduced: 17 x 17 grid,
columns A through S, rows 1 through 17. Pin
assignments are listed in Table 5.1.]

Figure 5.1. Pin Configuration - View from Top Side

[Pin grid diagram not reproduced: the same grid seen
from the pin side, with the metal lid at the center.]

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Figure 5.2. Pin Configuration— View from Pin Side 
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Table 5.1. Pin Cross Reference by Location 



Location Signal    Location Signal    Location Signal    Location Signal

A1       Vcc       C9       D47       J15      D17       Q10      BE6#
A2       Vss       C10      D43       J16      D14       Q11      BE4#
A3       Vcc       C11      D39       J17      D16       Q12      BE0#
A4       Vss       C12      D37       K1       A21       Q13      BSCN
A5       D56       C13      D35       K2       A18       Q14      D1
A6       D52       C14      D33       K3       A16       Q15      D2
A7       D50       C15      D32       K15      D13       Q16      Vss
A8       D48       C16      Vss       K16      D15       Q17      Vcc
A9       D46       C17      Vcc       K17      D12       R1       Vss
A10      D44       D1       Vss       L1       A19       R2       Vcc
A11      D40       D2       Vcc       L2       A15       R3       Vss
A12      D38       D3       D62       L3       A14       R4       Vcc
A13      Vcc       D15      D31       L15      D11       R5       A4
A14      Vss       D16      D30       L16      D8        R6       NENE#
A15      Vcc       D17      Vss       L17      D10       R7       HLDA
A16      Vss       E1       Vcc       M1       A17       R8       KEN#
A17      Vcc       E2       CC0       M2       A13       R9       NA#
B1       Vss       E3       CC1       M3       A11       R10      BE7#
B2       Vcc       E15      D29       M15      D7        R11      BE2#
B3       Vss       E16      D28       M16      D9        R12      BE1#
B4       D59       E17      D26       M17      D6        R13      SCAN
B5       D58       F1       A31       N1       A12       R14      D0
B6       D54       F2       A28       N2       A10       R15      Vss
B7       D53       F3       A30       N3       A9        R16      Vcc
B8       D49       F15      D27       N15      D5        R17      Vss
B9       D45       F16      D25       N16      D4        S1       Vcc
B10      D42       F17      D24       N17      Vcc       S2       Vss
B11      D41       G1       A29       P1       Vss       S3       Vcc
B12      D36       G2       A27       P2       A8        S4       Vss
B13      D34       G3       A26       P3       A7        S5       Vcc
B14      Vcc       G15      D23       P15      D3        S6       W/R#
B15      Vss       G16      D22       P16      Vcc       S7       ADS#
B16      Vcc       G17      D20       P17      Vss       S8       LOCK#
B17      Vss       H1       A25       Q1       Vcc       S9       INT/CS8
C1       Vcc       H2       A24       Q2       Vss       S10      BE5#
C2       Vss       H3       A22       Q3       A6        S11      BE3#
C3       D60       H15      D21       Q4       A5        S12      SHI
C4       D63       H16      D19       Q5       A3        S13      RESET
C5       D61       H17      D18       Q6       PTB       S14      Vss
C6       D57       J1       A23       Q7       BREQ      S15      Vcc
C7       D55       J2       A20       Q8       READY#    S16      Vss
C8       D51       J3       CLK       Q9       HOLD      S17      Vcc
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Table 5.2. Pin Cross Reference by Pin Name 



Signal    Location  Signal    Location  Signal    Location  Signal    Location
A3        Q5        CLK       J3        D41       B11       Vcc       B16
A4        R5        D0        R14       D42       B10       Vcc       C1
A5        Q4        D1        Q14       D43       C10       Vcc       C17
A6        Q3        D2        Q15       D44       A10       Vcc       D2
A7        P3        D3        P15       D45       B9        Vcc       E1
A8        P2        D4        N16       D46       A9        Vcc       N17
A9        N3        D5        N15       D47       C9        Vcc       P16
A10       N2        D6        M17       D48       A8        Vcc       Q1
A11       M3        D7        M15       D49       B8        Vcc       Q17
A12       N1        D8        L16       D50       A7        Vcc       R2
A13       M2        D9        M16       D51       C8        Vcc       R4
A14       L3        D10       L17       D52       A6        Vcc       R16
A15       L2        D11       L15       D53       B7        Vcc       S1
A16       K3        D12       K17       D54       B6        Vcc       S3
A17       M1        D13       K15       D55       C7        Vcc       S5
A18       K2        D14       J16       D56       A5        Vcc       S15
A19       L1        D15       K16       D57       C6        Vcc       S17
A20       J2        D16       J17       D58       B5        Vss       A2
A21       K1        D17       J15       D59       B4        Vss       A4
A22       H3        D18       H17       D60       C3        Vss       A14
A23       J1        D19       H16       D61       C5        Vss       A16
A24       H2        D20       G17       D62       D3        Vss       B1
A25       H1        D21       H15       D63       C4        Vss       B3
A26       G3        D22       G16       HLDA      R7        Vss       B15
A27       G2        D23       G15       HOLD      Q9        Vss       B17
A28       F2        D24       F17       INT/CS8   S9        Vss       C2
A29       G1        D25       F16       KEN#      R8        Vss       C16
A30       F3        D26       E17       LOCK#     S8        Vss       D1
A31       F1        D27       F15       NA#       R9        Vss       D17
ADS#      S7        D28       E16       NENE#     R6        Vss       P1
BE0#      Q12       D29       E15       PTB       Q6        Vss       P17
BE1#      R12       D30       D16       READY#    Q8        Vss       Q2
BE2#      R11       D31       D15       RESET     S13       Vss       Q16
BE3#      S11       D32       C15       SCAN      R13       Vss       R1
BE4#      Q11       D33       C14       SHI       S12       Vss       R3
BE5#      S10       D34       B13       Vcc       A1        Vss       R15
BE6#      Q10       D35       C13       Vcc       A3        Vss       R17
BE7#      R10       D36       B12       Vcc       A13       Vss       S2
BREQ      Q7        D37       C12       Vcc       A15       Vss       S4
BSCN      Q13       D38       A12       Vcc       A17       Vss       S14
CC0       E2        D39       C11       Vcc       B2        Vss       S16
CC1       E3        D40       A11       Vcc       B14       W/R#      S6
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Table 5.3. Ceramic PGA Package Dimension Symbols 


Letter or
Symbol      Description of Dimensions
A           Distance from seating plane to highest point of body
A1          Distance between seating plane and base plane (lid)
A2          Distance from base plane to highest point of body
A3          Distance from seating plane to bottom of body
B           Diameter of terminal lead pin
D           Largest overall package dimension of length
D1          A body length dimension, outer lead center to outer lead center
e1          Linear spacing between true lead position centerlines
L           Distance from seating plane to end of lead
S1          Other body dimension, outer lead center to edge of body

NOTES:
1. Controlling dimension: millimeter.
2. Dimension "e1" ("e") is non-cumulative.
3. Seating plane (standoff) is defined by P.C. board hole size: 0.0415-0.0430 inch.
4. Dimensions "B", "B1" and "C" are nominal.
5. Details of Pin 1 identifier are optional.
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Family: Ceramic Pin Grid Array Package 


            Millimeters                     Inches
Symbol      Min      Max      Notes         Min      Max      Notes
A           3.56     4.57                   0.140    0.180
A1          0.64     1.14     SOLID LID     0.025    0.045    SOLID LID
A2          2.79     3.56     SOLID LID     0.110    0.140    SOLID LID
A3          1.14     1.40                   0.045    0.055
B           0.43     0.51                   0.017    0.020
D           44.07    44.83                  1.735    1.765
D1          40.51    40.77                  1.595    1.605
e1          2.29     2.79                   0.090    0.110
L           2.54     3.30                   0.100    0.130
N           168               # of Pins     168               # of Pins
S1          1.52     2.54                   0.060    0.100

ISSUE       IWS REVX 7/15/88





Figure 5.3. 168 Lead Ceramic PGA Package Dimensions 



6.0 PACKAGE THERMAL
SPECIFICATIONS

For this section, let:

P   = maximum power consumption
TC  = case temperature
TA  = ambient air temperature
θCA = thermal resistance from case to ambient air
θJC = thermal resistance from junction to case
θJA = thermal resistance from junction to ambient air

The i860 XR microprocessor is specified for operation when TC is
within the range of 0°C-85°C. TC may be measured in any environment
to determine whether the i860 XR microprocessor is within specified
operating range. The case temperature should be measured at the
center of the top surface opposite the pins.

TA can be calculated from θCA (thermal resistance from case to
ambient) with the following equation:

TA = TC - P * θCA

Typical values for θCA and θJC at various airflows are given in
Table 6.1 for the 1.75 sq. in., 168 pin, ceramic PGA. θJC is also
shown so that θJA can be calculated by:

θJA = θJC + θCA

Note that θJC with a heatsink differs from θJC without a heatsink
because case temperature is measured differently. Case temperature
for θJC with heatsink is measured at the center of the heat fin
base. Case temperature for θJC without heatsink is measured at the
center of package top surface.

Table 6.2 shows the maximum TA allowable (without exceeding TC) at
various airflows and operating frequencies (fCLK).

Note that TA is greatly improved by attaching "fins" or a "heat
sink" to the package. P (the maximum power consumption) is
calculated by using the maximum ICC at 5V as tabulated in the DC
Characteristics of section 7.

Figure 6.1 gives typical ICC derating with case temperature. For
more information on heat sinks, measurement techniques, or package
characteristics, refer to Intel Packaging Handbook, order number
240800.
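
As a worked illustration of the ambient-temperature equation in this section, the following Python sketch computes the maximum allowable ambient temperature from the case-temperature limit, the maximum supply current at 5V, and the case-to-ambient thermal resistance. The function name and argument names are hypothetical and chosen only for this example; the input values are taken from Table 6.1 and the DC Characteristics.

```python
def max_ambient_temp(tc_max_c, icc_max_a, theta_ca, vcc=5.0):
    """Highest ambient temperature (deg C) that keeps the case at or below tc_max_c.

    tc_max_c  -- case temperature limit in deg C (85 for this part)
    icc_max_a -- maximum supply current in amperes
    theta_ca  -- case-to-ambient thermal resistance in deg C per watt
    vcc       -- supply voltage used for the power estimate (5V here)
    """
    power_w = vcc * icc_max_a              # P = Vcc * Icc(max)
    return tc_max_c - power_w * theta_ca   # Ta = Tc - P * theta_ca


# 25 MHz part (Icc max = 500 mA), with heat sink, still air (11 deg C/W)
print(max_ambient_temp(85.0, 0.500, 11.0))
# 40 MHz part (Icc max = 650 mA), no heat sink, still air (17.5 deg C/W)
print(max_ambient_temp(85.0, 0.650, 17.5))
```

The first call reproduces the 57.5°C entry of Table 6.2; the second gives 28.125°C, which the table lists rounded as 28.1°C.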



[Plot: ICC (mA) versus case temperature; typical part at 5V with maximum load]

Figure 6.1. ICC vs Case Temperature

Table 6.1. Thermal Resistance (°C/W) θJC and θCA

                             θCA at Airflow - ft/min (m/sec)
                    θJC      0        200      400      600      800      1000
                             (0)      (1.01)   (2.03)   (3.04)   (4.06)   (5.07)
With Heat Sink*     2        11       6        4        3.2      2.5      2.2
Without Heat Sink   1.5      17.5     13       11       9.5      8.5      8

*Nine-fin, unidirectional heat sink (fin dimensions: 0.350" height, 0.040"
width, 0.115" center-to-center spacing, 1.530" length).
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Table 6.2. Maximum Allowable Ta at Various Airflows 
in °C

                       fCLK     Airflow - ft/min (m/sec)
                       (MHz)    0        200      400      600      800      1000
                                (0)      (1.01)   (2.03)   (3.04)   (4.06)   (5.07)
TA with Heat Sink*     25.0     57.5     70       75       77       78.8     79.5
                       33.3     52       67       73       75.5     77.4     78.5
                       40.0     49.3     65.5     72       74.6     76.9     77.9
TA without Heat Sink   25.0     41.3     52.5     57.5     61.3     63.8     65
                       33.3     32.5     46       52       56.5     59.5     61
                       40.0     28.1     42.8     49.3     54.1     57.4     59

*Nine-fin, unidirectional heat sink (fin dimensions: 0.350" height, 0.040" width,
0.115" center-to-center spacing, 1.530" length).



7.0 ELECTRICAL DATA

Inputs and outputs are TTL compatible, except for CLK. All input
and output timings are specified relative to the 1.5 volt level of
the rising edge of CLK and refer to the point that the signals
reach 1.5V.

7.1 Absolute Maximum Ratings

Case Temperature TC under Bias ......... 0°C to 85°C
Storage Temperature ................ -65°C to +150°C
Voltage on Any Pin
  with Respect to Ground ............. -0.5V to 6.5V



NOTICE: This data sheet contains preliminary infor- 
mation on new products in production. The specifica- 
tions are subject to change without notice. Verify with 
your local Intel Sales office that you have the latest 
data sheet before finalizing a design. 



* WARNING: Stressing the device beyond the "Absolute 
Maximum Ratings" may cause permanent damage. 
These are stress ratings only. Operation beyond the 
"Operating Conditions" is not recommended and ex- 
tended exposure beyond the "Operating Conditions" 
may affect device reliability. 



7.2 D.C. Characteristics 





Table 7.1. DC Characteristics
TC = 0°C to 85°C, VCC = 5V ±5%

Symbol  Parameter                    Min     Max          Units  Notes
VIL     Input LOW Voltage            -0.3    +0.8         V
VIH     Input HIGH Voltage           2.0     VCC + 0.3    V
VILC    CLK Input LOW Voltage        -0.3    +0.8         V
VIHC    CLK Input HIGH Voltage       3.0     VCC + 0.3    V
VOL     Output LOW Voltage                   0.45         V      (Note 1)
VOH     Output HIGH Voltage          2.4                  V      (Note 2)
ICC     Power Supply Current
          CLK = 25.0 MHz                     500          mA     VCC @ 5V
          CLK = 33.3 MHz                     600          mA     VCC @ 5V
          CLK = 40.0 MHz                     650          mA     VCC @ 5V
ILI     Input Leakage Current                ±15          µA     No pullup or pulldown
ILO     Output Leakage Current               ±15          µA
CIN     Input Capacitance                    15           pF     (Note 3)
CO      I/O or Output Capacitance            15           pF     (Note 3)
CCLK    Clock Capacitance                    20           pF     (Note 3)

NOTES:
1. This parameter is measured at 4.0 mA for A31-A3, D63-D0, BE7#-BE0#; at 5.0 mA for all other outputs.
2. This parameter is measured at 1.0 mA for A31-A3, D63-D0, BE7#-BE0#; at 0.9 mA for all other outputs.
3. These are not tested. They are guaranteed by design characterization.
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7.3 A.C. Characteristics 



Table 7.2. A.C. Characteristics
TC = 0°C to 85°C, VCC = 5V ±5%
All timings measured at CLK = 1.5V unless otherwise specified.

                                          25 MHz       33 MHz       40 MHz
Symbol  Parameter                         Min    Max   Min    Max   Min    Max   Notes
                                          (ns)   (ns)  (ns)   (ns)  (ns)   (ns)
t1      CLK Period                        40     125   30     125   25     125
t2      CLK High Time                     6            5            3            at 3V
t3      CLK Low Time                      8            7            5            at 0.8V
t4      CLK Fall Time                            7            7            7     3V-0.8V
t5      CLK Rise Time                            7            7            7     0.8V-3V
t6a     A31-A3, PTB, W/R#, NENE#          3.5    25    3.5    23    3.5    19    50 pF Load
        Valid Delay
t6b     BEn#* Valid Delay                 3.5    27    3.5    25    3.5    21    50 pF Load
t7      Float Time, All                   3.5    40    3.5    30    3.5    25    (Note 1)
t8      ADS#, BREQ, LOCK#, HLDA           3.5    22    3.5    20    3.5    15    50 pF Load
        Valid Delay
t9      D63-D0 Valid Delay                3.5    38    3.5    35    3.5    31    50 pF Load
t10     Setup Time, All Inputs            13           11           8            (Note 2)
t11a    Hold Time, All Inputs except      4            4            3            (Note 2)
        DATA
t11b    DATA Hold Time                    5            4            3

NOTES:
1. Float condition occurs when maximum output current becomes less than ILO in magnitude. Float delay is not tested.
2. INT and HOLD are asynchronous inputs. The setup and hold specifications are given for test purposes or to assure
   recognition on a specific rising edge of CLK.

* n = 0, 1, ..., 7
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Figure 7.1. CLK, Input, and Output Timings 
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TYPICAL* OUTPUT 

DELAY (ns) @ 1.5V, plotted relative to nom, versus LOAD CAPACITANCE,
CL (pF), over the range 25 to 125 pF.

NOTES:
Graphs are not linear outside the CL range shown.
nom = nominal value given in the AC timing table.
*Typical part under worst-case conditions.

Figure 7.2. Typical Output Delay vs Load Capacitance under Worst-Case Conditions

[Plot: TYPICAL* OUTPUT SLEW TIME (ns) (0.8V-2.0V) versus LOAD CAPACITANCE,
CL (pF), over the range 25 to 150 pF. Curve labels: ADS#, BREQ, LOCK#, HLDA;
W/R#, NENE#.]

NOTES:
Graphs are not linear outside the CL range shown.
*Typical part under worst-case conditions.

Figure 7.3. Typical Slew Time vs Load Capacitance under Worst-Case Conditions

[Plot: ICC versus CLK frequency; vertical axis ticks from 200 to 700.]

NOTES:
Graphs are not linear outside the frequency range shown.
*Worst-case supply current at 5V.
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Figure 7.4. Typical Ice vs Frequency 
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8.0 INSTRUCTION SET 

Key to abbreviations:

For register operands, the abbreviations that describe the operands are composed of two parts. The first part
describes the type of register:

c        One of the control registers fir, psr, epsr, dirbase, db, or fsr
f        One of the floating-point registers: f0 through f31
i        One of the integer registers: r0 through r31

The second part identifies the field of the machine instruction into which the operand is to be placed:

src1     The first of the two source-register designators, which may be either a register or a 16-bit
         immediate constant or address offset. The immediate value is zero-extended for logical
         operations and is sign-extended for add and subtract operations (including addu and subu)
         and for all addressing calculations.
src1ni   Same as src1 except that no immediate constant or address offset value is permitted.
src1s    Same as src1 except that the immediate constant is a 5-bit value that is zero-extended to 32
         bits.
src2     The second of the two source-register designators.
dest     The destination register designator.

Thus, the operand specifier isrc2, for example, means that an integer register is used and that the encoding of
that register must be placed in the src2 field of the machine instruction.
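
The two extension rules for the 16-bit immediate form of the first source operand can be illustrated with a short Python sketch. This is an informal model, not part of the instruction definitions; the function name extend_imm16 and its logical flag are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def extend_imm16(imm16, logical):
    """Extend a 16-bit immediate to 32 bits.

    logical=True  -> zero-extend (logical operations)
    logical=False -> sign-extend (add, subtract, addressing calculations)
    """
    imm16 &= 0xFFFF
    if logical or imm16 < 0x8000:
        return imm16                          # upper 16 bits are zero
    return (imm16 | 0xFFFF0000) & MASK32      # replicate the sign bit upward


print(hex(extend_imm16(0x8000, logical=True)))
print(hex(extend_imm16(0x8000, logical=False)))
```

With the immediate 0x8000, a logical operation sees 0x00008000 while an add or an address calculation sees 0xFFFF8000 (that is, -32768).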

Other (nonregister) operands are specified by a one-part abbreviation that represents both the type of operand
required and the instruction field into which the value of the operand is placed:

#const   A 16-bit immediate constant or address offset that the i860 XR microprocessor sign-extends
         to 32 bits when computing the effective address.
lbroff   A signed, 26-bit, immediate, relative branch offset.
sbroff   A signed, 16-bit, immediate, relative branch offset.
brx      A function that computes the target address by shifting the offset (either lbroff or sbroff) left
         by two bits, sign-extending it to 32 bits, and adding the result to the current instruction pointer
         plus four. The resulting target address may lie anywhere within the address space.
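
The branch-target function described above can be sketched in Python as follows. The function signature (offset, field width, instruction pointer) is an assumption made for this example only; the sketch interprets the offset as signed before shifting, which gives the same 32-bit result as shifting first and then sign-extending.

```python
MASK32 = 0xFFFFFFFF


def brx(offset, width, ip):
    """Branch target for a relative offset field.

    offset -- raw offset field from the instruction
    width  -- field width in bits: 26 for the long form, 16 for the short form
    ip     -- address of the branch instruction
    """
    offset &= (1 << width) - 1
    if offset & (1 << (width - 1)):      # sign bit of the field is set
        offset -= 1 << width             # interpret the field as negative
    return (ip + 4 + (offset << 2)) & MASK32


print(hex(brx(1, 26, 0x1000)))           # one instruction past ip + 4
print(hex(brx(0x3FFFFFF, 26, 0x1000)))   # offset of -1: branch to itself
```

An offset of 1 from a branch at 0x1000 yields 0x1008, and an offset of -1 yields 0x1000, the address of the branch itself.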

Unless otherwise specified, floating-point operations accept single- or double-precision 
source operands and produce a result of equal or greater precision. Both input operands 
must have the same precision. The source and result precision are specified by a two-letter 
suffix to the mnemonic of the operation. 

Other abbreviations include:

.p       Precision specification .ss, .sd, or .dd (.ds not permitted). Refer to Table 8.1.
.r       Precision specification .ss, .sd, .ds, or .dd. Refer to Table 8.1.
.v       .sd or .dd. Refer to Table 8.1.
.w       .ss or .dd. Refer to Table 8.1.
.x       .b (8 bits), .s (16 bits), or .l (32 bits)
.y       .l (32 bits), .d (64 bits), or .q (128 bits)
.z       .l (32 bits), or .d (64 bits)

Table 8.1. Precision Specification

Suffix   Source Precision   Result Precision
.ss      single             single
.sd      single             double
.dd      double             double
.ds      double             single

mem.x(address)   The contents of the memory location indicated by address with a size of x.

PM       The pixel mask, which is considered as an array of eight bits PM[7]..PM[0], where PM[0] is
         the least significant bit.

8.1 Instruction Definitions in Alphabetical Order 

adds isrc1, isrc2, idest    Add Signed
    idest ← isrc1 + isrc2
    OF ← (bit 31 carry ≠ bit 30 carry)
    CC set if isrc2 < -isrc1 (signed)
    CC clear if isrc2 ≥ -isrc1 (signed)

addu isrc1, isrc2, idest    Add Unsigned
    idest ← isrc1 + isrc2
    OF ← bit 31 carry
    CC ← bit 31 carry
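
The flag rules for the signed and unsigned add can be modeled with a few lines of Python. This is an illustrative sketch only: the function names mirror the mnemonics, and the return convention (result, OF, CC) is an assumption of the example, not something defined by this data sheet.

```python
MASK32 = 0xFFFFFFFF


def to_signed(x):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def adds(src1, src2):
    """Signed add: returns (result, OF, CC)."""
    a, b = src1 & MASK32, src2 & MASK32
    carry31 = (a + b) >> 32                                 # carry out of bit 31
    carry30 = ((a & 0x7FFFFFFF) + (b & 0x7FFFFFFF)) >> 31   # carry out of bit 30
    of = carry31 != carry30                                 # signed overflow
    cc = to_signed(b) < -to_signed(a)                       # true sum is negative
    return (a + b) & MASK32, of, cc


def addu(src1, src2):
    """Unsigned add: returns (result, OF, CC); both flags follow the bit 31 carry."""
    total = (src1 & MASK32) + (src2 & MASK32)
    carry31 = bool(total >> 32)
    return total & MASK32, carry31, carry31


print(adds(0x7FFFFFFF, 1))    # largest positive value plus one overflows
print(addu(0xFFFFFFFF, 1))    # unsigned wrap-around sets both flags
```

Note that CC for the signed add is defined on the true (unbounded) sum, so it remains meaningful even when OF is set.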

and isrc1, isrc2, idest    Logical AND
    idest ← isrc1 and isrc2
    CC set if result is zero, cleared otherwise

andh #const, isrc2, idest    Logical AND High
    idest ← (#const shifted left 16 bits) and isrc2
    CC set if result is zero, cleared otherwise

andnot isrc1, isrc2, idest    Logical AND NOT
    idest ← not isrc1 and isrc2
    CC set if result is zero, cleared otherwise

andnoth #const, isrc2, idest    Logical AND NOT High
    idest ← not (#const shifted left 16 bits) and isrc2
    CC set if result is zero, cleared otherwise

bc lbroff    Branch on CC
    IF CC = 1
    THEN continue execution at brx(lbroff)
    FI

bc.t lbroff    Branch on CC, Taken
    IF CC = 1
    THEN execute one more sequential instruction
         continue execution at brx(lbroff)
    ELSE skip next sequential instruction
    FI

bla isrc1ni, isrc2, sbroff    Branch on LCC and Add
    LCC-temp clear if isrc2 < -isrc1ni (signed)
    LCC-temp set if isrc2 ≥ -isrc1ni (signed)
    isrc2 ← isrc1ni + isrc2
    Execute one more sequential instruction
    IF LCC
    THEN LCC ← LCC-temp
         continue execution at brx(sbroff)
    ELSE LCC ← LCC-temp
    FI

bnc lbroff    Branch on Not CC
    IF CC = 0
    THEN continue execution at brx(lbroff)
    FI

bnc.t lbroff    Branch on Not CC, Taken
    IF CC = 0
    THEN execute one more sequential instruction
         continue execution at brx(lbroff)
    ELSE skip next sequential instruction
    FI

br lbroff    Branch Direct Unconditionally
    Execute one more sequential instruction.
    Continue execution at brx(lbroff).

bri [isrc1ni]    Branch Indirect Unconditionally
    Execute one more sequential instruction
    IF any trap bit in psr is set
    THEN copy PU to U, PIM to IM in psr
         clear trap bits
         IF DS is set and DIM is reset
         THEN enter dual-instruction mode after executing one
              instruction in single-instruction mode
         ELSE IF DS is set and DIM is set
              THEN enter single-instruction mode after executing one
                   instruction in dual-instruction mode
              ELSE IF DIM is set
                   THEN enter dual-instruction mode
                        for next two instructions
                   ELSE enter single-instruction mode
                        for next two instructions
                   FI
              FI
         FI
    FI
    Continue execution at address in isrc1ni
    (The original contents of isrc1ni are used even if the next instruction
    modifies isrc1ni. Does not trap if isrc1ni is misaligned.)

bte isrc1s, isrc2, sbroff    Branch If Equal
    IF isrc1s = isrc2
    THEN continue execution at brx(sbroff)
    FI

btne isrc1s, isrc2, sbroff    Branch If Not Equal
    IF isrc1s ≠ isrc2
    THEN continue execution at brx(sbroff)
    FI

call lbroff    Subroutine Call
    r1 ← address of next sequential instruction + 4 (+8 in dual mode)
    Execute one more sequential instruction
    Continue execution at brx(lbroff)

calli [isrc1ni]    Indirect Subroutine Call
    r1 ← address of next sequential instruction + 4 (+8 in dual mode)
    Execute one more sequential instruction
    Continue execution at address in isrc1ni
    (The original contents of isrc1ni are used even if the next instruction
    modifies isrc1ni. Does not trap if isrc1ni is misaligned.
    The register isrc1ni must not be r1.)

fadd.p fsrc1, fsrc2, fdest    Floating-Point Add
    fdest ← fsrc1 + fsrc2

faddp fsrc1, fsrc2, fdest    Add with Pixel Merge
    fdest ← fsrc1 + fsrc2
    Shift and load MERGE register as defined in Table 8.2

faddz fsrc1, fsrc2, fdest    Add with Z Merge
    fdest ← fsrc1 + fsrc2
    Shift MERGE right 16 and load fields 31..16 and 63..48

famov.r fsrc1, fdest    Floating-Point Adder Move
    fdest ← fsrc1
    Send fsrc1 through the floating-point adder. (Preserves -0 (minus zero) when fsrc1 is -0. fsrc2
    must be coded as f0 by the assembler.)

fiadd.w fsrc1, fsrc2, fdest    Long-Integer Add
    fdest ← fsrc1 + fsrc2

fisub.w fsrc1, fsrc2, fdest    Long-Integer Subtract
    fdest ← fsrc1 - fsrc2

fix.v fsrc1, fdest    Floating-Point to Integer Conversion
    fdest ← 64-bit value with low-order 32 bits equal to integer part of fsrc1 rounded

Floating-Point Load
fld.y isrc1(isrc2), fdest      (Normal)
fld.y isrc1(isrc2)++, fdest    (Autoincrement)
    fdest ← mem.y (isrc1 + isrc2)
    IF autoincrement
    THEN isrc2 ← isrc1 + isrc2
    FI

Cache Flush
flush #const(isrc2)      (Normal)
flush #const(isrc2)++    (Autoincrement)
    Replace block in data cache with address (#const + isrc2).
    Contents of block undefined.
    IF autoincrement
    THEN isrc2 ← #const + isrc2
    FI

fmlow.dd fsrc1, fsrc2, fdest    Floating-Point Multiply Low
    fdest ← low-order 53 bits of fsrc1 mantissa x fsrc2 mantissa
    fdest bit 53 ← most significant bit of mantissa

fmov.r fsrc1, fdest    Floating-Point Reg-Reg Move
    Assembler pseudo-operation
    fmov.ss fsrc1, fdest = fiadd.ss fsrc1, f0, fdest
    fmov.dd fsrc1, fdest = fiadd.dd fsrc1, f0, fdest
    fmov.sd fsrc1, fdest = famov.sd fsrc1, fdest
    fmov.ds fsrc1, fdest = famov.ds fsrc1, fdest

fmul.p fsrc1, fsrc2, fdest    Floating-Point Multiply
    fdest ← fsrc1 x fsrc2

fnop    Floating-Point No Operation
    Assembler pseudo-operation
    fnop = shrd r0, r0, r0

form fsrc1, fdest    OR with MERGE Register
    fdest ← fsrc1 OR MERGE
    MERGE ← 0

frcp.p fsrc2, fdest    Floating-Point Reciprocal
    fdest ← 1 / fsrc2 with maximum mantissa error < 2^-7

frsqr.p fsrc2, fdest    Floating-Point Reciprocal Square Root
    fdest ← 1 / SQRT(fsrc2) with maximum mantissa error < 2^-7

Floating-Point Store
fst.y fdest, isrc1(isrc2)      (Normal)
fst.y fdest, isrc1(isrc2)++    (Autoincrement)
    mem.y (isrc2 + isrc1) ← fdest
    IF autoincrement
    THEN isrc2 ← isrc1 + isrc2
    FI

fsub.p fsrc1, fsrc2, fdest    Floating-Point Subtract
    fdest ← fsrc1 - fsrc2

ftrunc.v fsrc1, fdest    Floating-Point to Integer Conversion
    fdest ← 64-bit value with low-order 32 bits equal to integer part of fsrc1

fxfr fsrc1, idest    Transfer F-P to Integer Register
    idest ← fsrc1

fzchkl fsrc1, fsrc2, fdest    32-Bit Z-Buffer Check
    Consider fsrc1, fsrc2, and fdest as arrays of two 32-bit
    fields fsrc1(0)..fsrc1(1), fsrc2(0)..fsrc2(1), and fdest(0)..fdest(1)
    where zero denotes the least-significant field.
    PM ← PM shifted right by 2 bits
    FOR i = 0 to 1
    DO
        PM[i + 6] ← fsrc2(i) ≤ fsrc1(i) (unsigned)
        fdest(i) ← smaller of fsrc2(i) and fsrc1(i)
    OD
    MERGE ← 0

fzchks fsrc1, fsrc2, fdest    16-Bit Z-Buffer Check
    Consider fsrc1, fsrc2, and fdest as arrays of four 16-bit
    fields fsrc1(0)..fsrc1(3), fsrc2(0)..fsrc2(3), and fdest(0)..fdest(3)
    where zero denotes the least-significant field.
    PM ← PM shifted right by 4 bits
    FOR i = 0 to 3
    DO
        PM[i + 4] ← fsrc2(i) ≤ fsrc1(i) (unsigned)
        fdest(i) ← smaller of fsrc2(i) and fsrc1(i)
    OD
    MERGE ← 0
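
The 16-bit Z-buffer check can be modeled in Python to show how the pixel mask and the destination are built up field by field. This is a sketch under stated assumptions: operands are passed as 64-bit integers, the comparison is read as "less than or equal" (as printed for the pipelined form of this instruction), and the function returns the new destination and pixel mask rather than updating processor state. The clearing of MERGE is not modeled.

```python
def fzchks(src1, src2, pm):
    """Model of the 16-bit Z-buffer check.

    src1, src2 -- 64-bit integers holding four 16-bit Z values each
    pm         -- 8-bit pixel mask before the instruction
    Returns (dest, new_pm).
    """
    pm = (pm & 0xFF) >> 4                  # PM shifted right by 4 bits
    dest = 0
    for i in range(4):                     # field 0 is least significant
        z1 = (src1 >> (16 * i)) & 0xFFFF
        z2 = (src2 >> (16 * i)) & 0xFFFF
        if z2 <= z1:                       # unsigned compare
            pm |= 1 << (i + 4)             # PM[i + 4] records the result
        dest |= min(z1, z2) << (16 * i)    # keep the smaller Z value
    return dest, pm


dest, pm = fzchks(0x0004000300020001, 0x0001000300050000, 0xFF)
print(hex(dest), hex(pm))
```

In this example fields 0, 2, and 3 of the second operand pass the comparison, so PM bits 4, 6, and 7 are set above the four bits shifted down from the old mask.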

intovr    Software Trap on Integer Overflow
    If OF in epsr = 1, generate trap with IT set in psr.

ixfr isrc1ni, fdest    Transfer Integer to F-P Register
    fdest ← isrc1ni

ld.c csrc2, idest    Load from Control Register
    idest ← csrc2

ld.x isrc1(isrc2), idest    Load Integer
    idest ← mem.x(isrc1 + isrc2)

lock    Begin Interlocked Sequence
    Set BL in dirbase. The next load or store that misses the cache locks that location.
    Disable interrupts until the bus is unlocked.

mov isrc2, idest    Register-Register Move
    Assembler pseudo-operation
    mov isrc2, idest = shl r0, isrc2, idest

mov const32, idest    Constant-to-Register Move
    Assembler pseudo-operation
    adds l%const32, r0, idest
        ... when const32 < 0x8000
    orh h%const32, r0, idest
    or l%const32, idest, idest
        ... when const32 ≥ 0x8000

nop    Core-Unit No Operation
    Assembler pseudo-operation
    nop = shl r0, r0, r0

or isrc1, isrc2, idest    Logical OR
    idest ← isrc1 OR isrc2
    CC set if result is zero, cleared otherwise

orh #const, isrc2, idest    Logical OR High
    idest ← (#const shifted left 16 bits) OR isrc2
    CC set if result is zero, cleared otherwise

pfadd.p fsrc1, fsrc2, fdest    Pipelined Floating-Point Add
    fdest ← last stage Adder result
    Advance A pipeline one stage
    A pipeline first stage ← fsrc1 + fsrc2
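
The three-step behavior of the pipelined add (deliver the last-stage result, advance the pipeline, load the first stage) can be pictured with a small Python model. This is only an illustration: the class name is hypothetical, and the pipeline depth of three stages is an assumption of the sketch, since this section does not state the depth.

```python
from collections import deque


class AdderPipeline:
    """Toy model of the adder (A) pipeline used by the pipelined add."""

    def __init__(self, depth=3):             # depth is an assumed value
        # Index 0 is the first stage; the rightmost entry is the last stage.
        self.stages = deque([None] * depth)

    def pfadd(self, src1, src2):
        result = self.stages.pop()            # last stage result goes to the destination
        self.stages.appendleft(src1 + src2)   # advance; first stage gets the new sum
        return result


pipe = AdderPipeline()
outputs = [pipe.pfadd(1.0, 2.0), pipe.pfadd(3.0, 4.0),
           pipe.pfadd(5.0, 6.0), pipe.pfadd(0.0, 0.0)]
print(outputs)
```

With a depth of three, the sum of the first call emerges on the fourth call; the first three results are whatever the pipeline held beforehand (None in this model).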

pfaddp fsrc1, fsrc2, fdest    Pipelined Add with Pixel Merge
    fdest ← last stage Graphics result
    last stage Graphics result ← fsrc1 + fsrc2
    Shift and load MERGE register from last stage Graphics result as defined in Table 8.2

pfaddz fsrc1, fsrc2, fdest    Pipelined Add with Z Merge
    fdest ← last stage Graphics result
    last stage Graphics result ← fsrc1 + fsrc2
    Shift MERGE right 16 and load fields 31..16 and 63..48 from last stage Graphics result

pfam.p fsrc1, fsrc2, fdest    Pipelined Floating-Point Add and Multiply
    fdest ← last stage Adder result
    Advance A and M pipeline one stage (operands accessed before advancing pipeline)
    A pipeline first stage ← A-op1 + A-op2
    M pipeline first stage ← M-op1 x M-op2

pfamov.r fsrc1, fdest    Pipelined Floating-Point Adder Move
    fdest ← last stage Adder result
    Advance A pipeline one stage
    A pipeline first stage ← fsrc1

pfeq.p fsrc1, fsrc2, fdest    Pipelined Floating-Point Equal Compare
    fdest ← last stage Adder result
    CC set if fsrc1 = fsrc2, else cleared
    Advance A pipeline one stage
    A pipeline first stage is undefined, but no result exception occurs

pfgt.p fsrc1, fsrc2, fdest    Pipelined Floating-Point Greater-Than Compare
    (Assembler clears R-bit of instruction)
    fdest ← last stage Adder result
    CC set if fsrc1 > fsrc2, else cleared
    Advance A pipeline one stage
    A pipeline first stage is undefined, but no result exception occurs

pfiadd.w fsrc1, fsrc2, fdest    Pipelined Long-Integer Add
    fdest ← last stage Graphics result
    last stage Graphics result ← fsrc1 + fsrc2

pfisub.w fsrc1, fsrc2, fdest    Pipelined Long-Integer Subtract
    fdest ← last stage Graphics result
    last stage Graphics result ← fsrc1 - fsrc2

pfix.v fsrc1, fdest    Pipelined Floating-Point to Integer Conversion
    fdest ← last stage Adder result
    Advance A pipeline one stage
    A pipeline first stage ← 64-bit value with low-order 32 bits
        equal to integer part of fsrc1 rounded

Pipelined Floating-Point Load
pfld.z isrc1(isrc2), fdest      (Normal)
pfld.z isrc1(isrc2)++, fdest    (Autoincrement)
    fdest ← mem.z (third previous pfld's (isrc1 + isrc2))
        (where .z is precision of third previous pfld.z)
    IF autoincrement
    THEN isrc2 ← isrc1 + isrc2
    FI

pfle.p fsrc1, fsrc2, fdest    Pipelined F-P Less-Than or Equal Compare
    Assembler pseudo-operation, identical to pfgt.p except that
    assembler sets R-bit of instruction.
    fdest ← last stage Adder result
    CC clear if fsrc1 ≤ fsrc2, else set
    Advance A pipeline one stage
    A pipeline first stage is undefined, but no result exception occurs

pfmam.p fsrc1, fsrc2, fdest    Pipelined Floating-Point Add and Multiply
    fdest ← last stage Multiplier result
    Advance A and M pipeline one stage (operands accessed before advancing pipeline)
    A pipeline first stage ← A-op1 + A-op2
    M pipeline first stage ← M-op1 x M-op2

pfmov.r fsrc1, fdest    Pipelined Floating-Point Reg-Reg Move
    Assembler pseudo-operation
    pfmov.ss fsrc1, fdest = pfiadd.ss fsrc1, f0, fdest
    pfmov.dd fsrc1, fdest = pfiadd.dd fsrc1, f0, fdest
    pfmov.sd fsrc1, fdest = pfamov.sd fsrc1, fdest
    pfmov.ds fsrc1, fdest = pfamov.ds fsrc1, fdest

pfmsm.p fsrc1, fsrc2, fdest    Pipelined Floating-Point Subtract and Multiply
    fdest ← last stage Multiplier result
    Advance A and M pipeline one stage (operands accessed before advancing pipeline)
    A pipeline first stage ← A-op1 - A-op2
    M pipeline first stage ← M-op1 x M-op2

pfmul.p fsrc1, fsrc2, fdest    Pipelined Floating-Point Multiply
    fdest ← last stage Multiplier result
    Advance M pipeline one stage
    M pipeline first stage ← fsrc1 x fsrc2

pfmul3.dd fsrc1, fsrc2, fdest    Three-Stage Pipelined Multiply
    fdest ← last stage Multiplier result
    Advance 3-Stage M pipeline one stage
    M pipeline first stage ← fsrc1 x fsrc2

pform fsrc1, fdest    Pipelined OR to MERGE Register
    fdest ← last stage Graphics result
    last stage Graphics result ← fsrc1 OR MERGE
    MERGE ← 0

pfsm.p fsrc1, fsrc2, fdest    Pipelined Floating-Point Subtract and Multiply
    fdest ← last stage Adder result
    Advance A and M pipeline one stage (operands accessed before advancing pipeline)
    A pipeline first stage ← A-op1 - A-op2
    M pipeline first stage ← M-op1 x M-op2

pfsub.p fsrc1, fsrc2, fdest    Pipelined Floating-Point Subtract
    fdest ← last stage Adder result
    Advance A pipeline one stage
    A pipeline first stage ← fsrc1 - fsrc2

pftrunc.v fsrc1, fdest    Pipelined Floating-Point to Integer Conversion
    fdest ← last stage Adder result
    Advance A pipeline one stage
    A pipeline first stage ← 64-bit value with low-order 32 bits
        equal to integer part of fsrc1

pfzchkl fsrc1, fsrc2, fdest    Pipelined 32-Bit Z-Buffer Check
    Consider fsrc1, fsrc2, and fdest as arrays of two 32-bit
    fields fsrc1(0)..fsrc1(1), fsrc2(0)..fsrc2(1), and fdest(0)..fdest(1)
    where zero denotes the least significant field.
    PM ← PM shifted right by 2 bits
    FOR i = 0 to 1
    DO
        PM[i + 6] ← fsrc2(i) ≤ fsrc1(i) (unsigned)
        fdest(i) ← last stage Graphics result
        last stage Graphics result ← smaller of fsrc2(i) and fsrc1(i)
    OD
    MERGE ← 0

pfzchks fsrc1, fsrc2, fdest    Pipelined 16-Bit Z-Buffer Check
    Consider fsrc1, fsrc2, and fdest as arrays of four 16-bit
    fields fsrc1(0)..fsrc1(3), fsrc2(0)..fsrc2(3), and fdest(0)..fdest(3)
    where zero denotes the least significant field.
    PM ← PM shifted right by 4 bits
    FOR i = 0 to 3
    DO
        PM[i + 4] ← fsrc2(i) ≤ fsrc1(i) (unsigned)
        fdest(i) ← last stage Graphics result
        last stage Graphics result ← smaller of fsrc2(i) and fsrc1(i)
    OD
    MERGE ← 0

pst.d fdest, #const(isrc2)      Pixel Store
pst.d fdest, #const(isrc2)++    Pixel Store Autoincrement
    Pixels enabled by PM in mem.d (isrc2 + #const) ← fdest
    Shift PM right by 8/pixel size (in bytes) bits
    IF autoincrement
    THEN isrc2 ← #const + isrc2
    FI

shl isrc1, isrc2, idest    Shift Left
    idest ← isrc2 shifted left by isrc1 bits

shr isrc1, isrc2, idest    Shift Right
    SC (in psr) ← isrc1
    idest ← isrc2 shifted right by isrc1 bits

shra isrc1, isrc2, idest    Shift Right Arithmetic
    idest ← isrc2 arithmetically shifted right by isrc1 bits

shrd isrc1, isrc2, idest    Shift Right Double
    idest ← low-order 32 bits of isrc1:isrc2 shifted right by SC bits
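
Because the double shift takes its count from the SC field that a preceding shift right loads, the two instructions are normally used as a pair. The Python sketch below models that interaction; the class name is hypothetical, and masking the count to five bits is an assumption of this example rather than a statement from this data sheet.

```python
MASK32 = 0xFFFFFFFF


class ShiftUnit:
    """Toy model of the shift-right and shift-right-double pair."""

    def __init__(self):
        self.sc = 0                                # shift count field held in psr

    def shr(self, src1, src2):
        self.sc = src1 & 31                        # assumed 5-bit shift count
        return (src2 & MASK32) >> self.sc          # logical shift right

    def shrd(self, src1, src2):
        pair = ((src1 & MASK32) << 32) | (src2 & MASK32)   # src1:src2 as 64 bits
        return (pair >> self.sc) & MASK32                  # keep the low-order 32 bits


unit = ShiftUnit()
print(hex(unit.shr(8, 0x12345678)))            # also loads SC with 8
print(hex(unit.shrd(0x12345678, 0x12345678)))  # same value twice: rotate right by SC
```

Supplying the same register as both sources of the double shift produces a 32-bit rotate right by SC bits, as the second call shows.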

st.c isrc1ni, csrc2    Store to Control Register
    csrc2 ← isrc1ni

st.x isrc1ni, #const(isrc2)    Store Integer
    mem.x (isrc2 + #const) ← isrc1ni

subs isrc1, isrc2, idest    Subtract Signed
    idest ← isrc1 - isrc2
    OF ← (bit 31 carry ≠ bit 30 carry)
    CC set if isrc2 > isrc1 (signed)
    CC clear if isrc2 ≤ isrc1 (signed)

subu isrc1, isrc2, idest    Subtract Unsigned
    idest ← isrc1 - isrc2
    OF ← NOT (bit 31 carry)
    CC ← bit 31 carry
    (i.e. CC set if isrc2 ≤ isrc1 (unsigned)
    CC clear if isrc2 > isrc1 (unsigned))

trap isrc1ni, isrc2, idest    Software Trap
    Generate trap with IT set in psr

unlock    End Interlocked Sequence
    Clear BL in dirbase. The next load or store unlocks the bus.
    Enable interrupts after bus is unlocked.

xor isrc1, isrc2, idest    Logical Exclusive OR
    idest ← isrc1 XOR isrc2
    CC set if result is zero, cleared otherwise

xorh #const, isrc2, idest    Logical Exclusive OR High
    idest ← (#const shifted left 16 bits) XOR isrc2
    CC set if result is zero, cleared otherwise
Table 8.2. FADDP MERGE Update 

Pixel Size    Fields Loaded From                 Right Shift Amount 
(from PS)     Result into MERGE                  (Field Size) 

 8            63..56, 47..40, 31..24, 15..8      8 
16            63..58, 47..42, 31..26, 15..10     6 
32            63..56, 31..24                     8 

8.2 Instruction Format and Encoding 

All instructions are 32 bits long and begin on a four-byte 
boundary. When operands are registers, the 
register encodings shown in Table 8.3 are used. 
There are two general core-instruction formats, 
REG-format and CTRL-format, as well as a separate 
format for floating-point instructions. 

8.2.1 REG-FORMAT INSTRUCTIONS 

Within the REG-format are several variations as 
shown in Figure 8.1. Table 8.4 gives the encodings 
for these instructions. One encoding is an escape 
code that defines yet another variation: the core 
escape instructions. Figure 8.2 shows the format of 
this group, and Table 8.5 shows the encodings. 

In these instructions, the src2 field selects one of 
the 32 integer registers (most instructions) or five 
control registers (st.c and ld.c). Dest selects one of 
the 32 integer registers (most instructions) or 
floating-point registers (fld, fst, pfld, pst, ixfr). For 
instructions where src1 is optionally an immediate 
value, bit 26 of the opcode (I-bit) indicates whether src1 
is an immediate. If bit 26 is clear, an integer register 
is used; if bit 26 is set, src1 is contained in the 
low-order 16 bits, except for bte and btne instructions. 
For bte and btne, the five-bit immediate value is 
contained in the src1 field. For st, bte, btne, and 
bla, the upper five bits of the offset or broffset are 
contained in the dest field instead of src1, and the 
lower 11 bits of offset are the lower 11 bits of the 
instruction. 

Table 8.3. Register Encoding 

Register                     Encoding 

r0                           0 
  .                          . 
r31                          31 

f0                           0 
  .                          . 
f31                          31 

Fault Instruction            0 
Processor Status             1 
Directory Base               2 
Data Breakpoint              3 
Floating-Point Status        4 
Extended Process Status      5 



For ld and st, bits 28 and zero determine operand 
size as follows: 

Bit 28    Bit 0    Operand Size 

0         0        8-bits 
0         1        8-bits 
1         0        16-bits 
1         1        32-bits 

When src1 is an immediate and bit 28 is set, bit zero 
of the immediate value is forced to zero. 

For fld, fst, pfld, pst, and flush, bit 0 selects 
autoincrement addressing if set. For fld, fst, pfld, and 
pst, bits one and two select the operand size as 
follows: 

Bit 1    Bit 2    Operand Size 

0        0        64-bits 
0        1        128-bits 
1        0        32-bits 
1        1        32-bits 

When src1 is an immediate value, bits zero and one 
of the immediate value are forced to zero to maintain 
alignment. When bit one of the immediate value 
is clear, bit two is also forced to zero. 

For flush, bits one and two must be zero. 
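
The two operand-size tables above translate directly into bit tests on the 32-bit instruction word. The following is a minimal sketch; the function names are hypothetical, and the caller is assumed to have already established which instruction class the word belongs to.

```python
def int_operand_bits(instr: int) -> int:
    """Operand size in bits for ld.x / st.x, from bits 28 and 0."""
    if not (instr >> 28) & 1:
        return 8                        # bit 28 clear: 8 bits, bit 0 ignored
    return 32 if instr & 1 else 16      # bit 28 set: bit 0 picks 16 or 32

def fp_operand_bits(instr: int) -> int:
    """Operand size in bits for fld / fst / pfld / pst, from bits 1 and 2."""
    if (instr >> 1) & 1:
        return 32                       # bit 1 set: 32 bits, bit 2 ignored
    return 128 if (instr >> 2) & 1 else 64

def autoincrement(instr: int) -> bool:
    """Bit 0 selects autoincrement for fld, fst, pfld, pst, and flush."""
    return bool(instr & 1)
```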
General Format 

  Bits 31..26   OPCODE/I 
  Bits 25..21   SRC2 
  Bits 20..16   DEST 
  Bits 15..11   SRC1 
  Bits 10..0    IMMEDIATE, OFFSET, OR NULL 

16-Bit Immediate Variant (except bte and btne) 

  Bits 31..26   OPCODE (I = 1) 
  Bits 25..21   SRC2 
  Bits 20..16   DEST 
  Bits 15..0    IMMEDIATE 

st, bla, bte, and btne 

  Bits 31..26   OPCODE/I 
  Bits 25..21   SRC2 
  Bits 20..16   OFFSET HIGH 
  Bits 15..11   SRC1 
  Bits 10..0    OFFSET LOW 

bte and btne with 5-Bit Immediate 

  Bits 31..26   OPCODE (I = 1) 
  Bits 25..21   SRC2 
  Bits 20..16   OFFSET HIGH 
  Bits 15..11   IMMEDIATE 
  Bits 10..0    OFFSET LOW 

Figure 8.1. REG-Format Variations 
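
The REG-format variations differ only in how the low 21 bits are interpreted, so one field extractor can serve all of them. The sketch below is illustrative only: the function and key names are hypothetical, and selecting the right interpretation for a given opcode is left to the caller.

```python
def decode_reg_format(instr: int) -> dict:
    """Split a 32-bit REG-format word into its raw fields."""
    dest = (instr >> 16) & 0x1F
    low11 = instr & 0x7FF
    return {
        "opcode": (instr >> 26) & 0x3F,   # bit 26 is the I-bit
        "i_bit": (instr >> 26) & 1,
        "src2": (instr >> 21) & 0x1F,
        "dest": dest,
        "src1": (instr >> 11) & 0x1F,     # 5-bit immediate for bte/btne
        "imm16": instr & 0xFFFF,          # 16-bit immediate variant
        # st, bla, bte, btne: upper five offset bits live in the dest field
        "split_offset": (dest << 11) | low11,
    }
```

The split offset is returned unsigned; any sign extension or scaling is outside what this sketch covers.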
Table 8.4. REG-Format Opcodes 

                                                  31             26 

ld.x                 Load Integer                  0  0  0  L  0  I 
st.x                 Store Integer                 0  0  0  L  1  1 
ixfr                 Integer to F-P Reg Transfer   0  0  0  0  1  0 
                     (reserved)                    0  0  0  1  1  0 
fld.x, fst.x         Load/Store F-P                0  0  1  0  LS I 
flush                Flush                         0  0  1  1  0  1 
pst.d                Pixel Store                   0  0  1  1  1  1 
ld.c, st.c           Load/Store Control Register   0  0  1  1  LS 0 
bri                  Branch Indirect               0  1  0  0  0  0 
trap                 Trap                          0  1  0  0  0  1 
                     (Escape for F-P Unit)         0  1  0  0  1  0 
                     (Escape for Core Unit)        0  1  0  0  1  1 
bte, btne            Branch Equal or Not Equal     0  1  0  1  E  I 
pfld.y               Pipelined F-P Load            0  1  1  0  0  I 
                     (CTRL-Format Instructions)    0  1  1  X  X  X 
addu, -s, subu, -s   Add/Subtract                  1  0  0  SO AS I 
shl, shr             Logical Shift                 1  0  1  0  LR I 
shrd                 Double Shift                  1  0  1  1  0  0 
bla                  Branch LCC Set and Add        1  0  1  1  0  1 
shra                 Arithmetic Shift              1  0  1  1  1  I 
and(h)               AND                           1  1  0  0  H  I 
andnot(h)            ANDNOT                        1  1  0  1  H  I 
or(h)                OR                            1  1  1  0  H  I 
xor(h)               XOR                           1  1  1  1  H  I 
                     (reserved)                    1  1  X  X  1  0 

L    Integer Length 
     0 = 8 bits 
     1 = 16 or 32 bits (selected by bit 0) 
LS   Load/Store 
     0 = Load 
     1 = Store 
SO   Signed/Ordinal 
     0 = Ordinal 
     1 = Signed 
H    High 
     0 = and, or, andnot, xor 
     1 = andh, orh, andnoth, xorh 
AS   Add/Subtract 
     0 = Add 
     1 = Subtract 
LR   Left/Right 
     0 = Left Shift 
     1 = Right Shift 
E    Equal 
     0 = Branch on Not Equal 
     1 = Branch on Equal 
I    Immediate 
     0 = src1 is register 
     1 = src1 is immediate 





  Bits 31..26   0 1 0 0 1 1 
  Bits 25..16   reserved* 
  Bits 15..11   SRC1 
  Bits 10..5    reserved* 
  Bits 4..0     OPCODE 

*Reserved (must be set to zero by assemblers) 

Figure 8.2. Core Escape Instruction Format 
Table 8.5. Core Escape Opcodes 

                                       4           0 

          (reserved)                   0  0  0  0  0 
lock      Begin Interlocked Sequence   0  0  0  0  1 
calli     Indirect Subroutine Call     0  0  0  1  0 
          (reserved)                   0  0  0  1  1 
intovr    Trap on Integer Overflow     0  0  1  0  0 
          (reserved)                   0  0  1  0  1 
          (reserved)                   0  0  1  1  0 
unlock    End Interlocked Sequence     0  0  1  1  1 
          (reserved)                   0  1  X  X  X 
          (reserved)                   1  0  X  X  X 
          (reserved)                   1  1  X  X  X 




8.2.2 CTRL-FORMAT INSTRUCTIONS 

The CTRL instructions do not refer to registers, so instead of the register fields, they have a 26-bit relative 
branch offset. Figure 8.3 shows the format of these instructions and Table 8.6 defines the encodings. 






  Bits 31..29   0 1 1 
  Bits 28..26   OPC 
  Bits 25..0    BROFFSET 

BROFFSET is a signed 26-bit relative branch offset. 

Figure 8.3. CTRL Instruction Format 



Table 8.6. CTRL-Format Opcodes 

                                  28    26 

           (reserved)              0  0  0 
           (reserved)              0  0  1 
br         Branch Direct           0  1  0 
call       Call                    0  1  1 
bc(.t)     Branch on CC Set        1  0  T 
bnc(.t)    Branch on CC Clear      1  1  T 

T    Taken 
     0 = bc or bnc 
     1 = bc.t or bnc.t 
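
A CTRL-format word can be decoded by checking the fixed 011 prefix, reading the 3-bit OPC field, and sign-extending the 26-bit BROFFSET. The sketch below is an illustration under those assumptions; the function name and dictionary are hypothetical, and how the offset is scaled and applied to form the branch target is not covered here.

```python
CTRL_MNEMONICS = {
    0b010: "br",
    0b011: "call",
    0b100: "bc",
    0b101: "bc.t",
    0b110: "bnc",
    0b111: "bnc.t",
}

def decode_ctrl(instr: int):
    """Return (mnemonic, broffset) for a CTRL-format word, else None."""
    if (instr >> 29) & 0x7 != 0b011:
        return None                      # not a CTRL-format instruction
    opc = (instr >> 26) & 0x7
    broffset = instr & 0x3FFFFFF
    if broffset & 0x2000000:             # sign bit of the 26-bit field
        broffset -= 0x4000000
    return CTRL_MNEMONICS.get(opc, "(reserved)"), broffset
```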
8.2.3 FLOATING-POINT INSTRUCTIONS 

The floating-point instructions also constitute an escape series. All these instructions begin with the bit 
sequence 010010. Figure 8.4 shows the format of the floating-point instructions, and Table 8.7 gives the 
encodings. Within the dual-operation instructions is a subcode DPC whose values are given in Table 8.8 along with 
the mnemonic that corresponds to each. 

  Bits 31..26   0 1 0 0 1 0 
  Bits 25..21   SRC2 
  Bits 20..16   DEST 
  Bits 15..11   SRC1 
  Bit  10       P 
  Bit  9        D 
  Bit  8        S 
  Bit  7        R 
  Bits 6..0     OPCODE 

SRC1, SRC2   Source; one of 32 floating-point registers 
DEST         Destination register 
             (instructions other than fxfr) one of 32 floating-point registers 
             (fxfr) one of 32 integer registers 

P    Pipelining 
     1 = Pipelined instruction mode 
     0 = Scalar instruction mode 
D    Dual-Instruction Mode 
     1 = Dual-instruction mode 
     0 = Single-instruction mode 
S    Source Precision 
     1 = Double-precision source operands 
     0 = Single-precision source operands 
R    Result Precision 
     1 = Double-precision result 
     0 = Single-precision result 

Figure 8.4. Floating-Point Instruction Encoding 

Table 8.7. Floating-Point Opcodes 

                                             6                 0 

pfam          Add and Multiply*              0  0  0  DPC 
pfmam         Multiply with Add*             0  0  0  DPC 
pfsm          Subtract and Multiply*         0  0  1  DPC 
pfmsm         Multiply with Subtract*        0  0  1  DPC 

(p)fmul       Multiply                       0  1  0  0  0  0  0 
fmlow         Multiply Low                   0  1  0  0  0  0  1 
frcp          Reciprocal                     0  1  0  0  0  1  0 
frsqr         Reciprocal Square Root         0  1  0  0  0  1  1 
pfmul3.dd     3-Stage Pipelined Multiply     0  1  0  0  1  0  0 

(p)fadd       Add                            0  1  1  0  0  0  0 
(p)fsub       Subtract                       0  1  1  0  0  0  1 
(p)fix        Fix                            0  1  1  0  0  1  0 
(p)famov      Adder Move                     0  1  1  0  0  1  1 
pfgt/pfle**   Greater Than                   0  1  1  0  1  0  0 
pfeq          Equal                          0  1  1  0  1  0  1 
(p)ftrunc     Truncate                       0  1  1  1  0  1  0 

fxfr          Transfer to Integer Register   1  0  0  0  0  0  0 
(p)fiadd      Long-Integer Add               1  0  0  1  0  0  1 
(p)fisub      Long-Integer Subtract          1  0  0  1  1  0  1 

(p)fzchkl     Z-Check Long                   1  0  1  0  1  1  1 
(p)fzchks     Z-Check Short                  1  0  1  1  1  1  1 
(p)faddp      Add with Pixel Merge           1  0  1  0  0  0  0 
(p)faddz      Add with Z Merge               1  0  1  0  0  0  1 
(p)form       OR with MERGE Register         1  0  1  1  0  1  0 

*pfam and pfsm have P-bit set; pfmam and pfmsm have P-bit clear. 
**pfgt has R bit cleared; pfle has R bit set. 

NOTE: 

All opcodes not shown are reserved. 
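
The floating-point format of Figure 8.4 can be unpacked the same way as the core formats. The sketch below is illustrative: the function name is hypothetical, and the bit positions used for P, D, S, and R (bits 10, 9, 8, and 7) are inferred from the field order in the figure and the 7-bit opcode width of Table 8.7.

```python
FP_ESCAPE = 0b010010

def decode_fp(instr: int):
    """Return the fields of a floating-point instruction word, else None."""
    if (instr >> 26) & 0x3F != FP_ESCAPE:
        return None
    return {
        "src2": (instr >> 21) & 0x1F,
        "dest": (instr >> 16) & 0x1F,
        "src1": (instr >> 11) & 0x1F,
        "P": (instr >> 10) & 1,     # 1 = pipelined
        "D": (instr >> 9) & 1,      # 1 = dual-instruction mode
        "S": (instr >> 8) & 1,      # 1 = double-precision sources
        "R": (instr >> 7) & 1,      # 1 = double-precision result
        "opcode": instr & 0x7F,     # for dual-operation forms, low 4 bits are DPC
    }
```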
The following table shows the opcode mnemonics that generate the various encodings of DPC and explains 
each encoding. 

Table 8.8. DPC Encoding 

       PFAM       PFSM       M-Unit     M-Unit     A-Unit     A-Unit     T      K 
DPC    Mnemonic   Mnemonic   op1        op2        op1        op2        Load   Load* 

0000   r2p1       r2s1       KR         src2       src1       M result   No     No 
0001   r2pt       r2st       KR         src2       T          M result   No     Yes 
0010   r2ap1      r2as1      KR         src2       src1       A result   Yes    No 
0011   r2apt      r2ast      KR         src2       T          A result   Yes    Yes 
0100   i2p1       i2s1       KI         src2       src1       M result   No     No 
0101   i2pt       i2st       KI         src2       T          M result   No     Yes 
0110   i2ap1      i2as1      KI         src2       src1       A result   Yes    No 
0111   i2apt      i2ast      KI         src2       T          A result   Yes    Yes 
1000   rat1p2     rat1s2     KR         A result   src1       src2       Yes    No 
1001   m12apm     m12asm     src1       src2       A result   M result   No     No 
1010   ra1p2      ra1s2      KR         A result   src1       src2       No     No 
1011   m12ttpa    m12ttsa    src1       src2       T          A result   Yes    No 
1100   iat1p2     iat1s2     KI         A result   src1       src2       Yes    No 
1101   m12tpm     m12tsm     src1       src2       T          M result   No     No 
1110   ia1p2      ia1s2      KI         A result   src1       src2       No     No 
1111   m12tpa     m12tsa     src1       src2       T          A result   No     No 

       PFMAM      PFMSM      M-Unit     M-Unit     A-Unit     A-Unit     T      K 
DPC    Mnemonic   Mnemonic   op1        op2        op1        op2        Load   Load* 

0000   mr2p1      mr2s1      KR         src2       src1       M result   No     No 
0001   mr2pt      mr2st      KR         src2       T          M result   No     Yes 
0010   mr2mp1     mr2ms1     KR         src2       src1       M result   Yes    No 
0011   mr2mpt     mr2mst     KR         src2       T          M result   Yes    Yes 
0100   mi2p1      mi2s1      KI         src2       src1       M result   No     No 
0101   mi2pt      mi2st      KI         src2       T          M result   No     Yes 
0110   mi2mp1     mi2ms1     KI         src2       src1       M result   Yes    No 
0111   mi2mpt     mi2mst     KI         src2       T          M result   Yes    Yes 
1000   mrmt1p2    mrmt1s2    KR         M result   src1       src2       Yes    No 
1001   mm12mpm    mm12msm    src1       src2       M result   M result   No     No 
1010   mrm1p2     mrm1s2     KR         M result   src1       src2       No     No 
1011   mm12ttpm   mm12ttsm   src1       src2       T          A result   Yes    No 
1100   mimt1p2    mimt1s2    KI         M result   src1       src2       Yes    No 
1101   mm12tpm    mm12tsm    src1       src2       T          M result   No     No 
1110   mim1p2     mim1s2     KI         M result   src1       src2       No     No 
1111   Intel-Reserved 

*If K-load is set, KR is loaded when operand-1 of the multiplier is KR; KI is loaded when operand-1 of the multiplier is KI. 
8.3 Instruction Timings 

i860 XR microprocessor instructions take one clock 
to execute unless a freeze condition is invoked. 
Freeze conditions and their associated delays are 
shown in the table below. Freezes due to multiple 
simultaneous cache misses result in a delay that is 
the sum of the delays for processing each miss by 
itself. Other multiple freeze conditions usually add 
only the delay of the longest individual freeze. 

Freeze Condition: Instruction-cache miss 
Delay: Number of clocks to read instruction (from ADS# 
clock to first READY# clock) plus time to last 
READY# of block when jump or freeze occurs 
during miss processing plus two clocks if data-cache 
being accessed when instruction-cache 
miss occurs. 

Freeze Condition: Reference to destination of ld instruction that 
misses 
Delay: One plus number of clocks to read data (from 
ADS# clock to first READY# clock) minus number 
of instructions executed since load (not counting 
instruction that references load destination) 

Freeze Condition: fld miss 
Delay: One plus number of clocks until first READY# 
returned (for 32- or 64-bit read cycles) or until 
second READY# returned (for 128-bit fld.q read 
cycles) 

Freeze Condition: call, calli, ixfr, fxfr, ld.c, or st.c and data cache 
load miss processing in progress 
Delay: One plus number of clocks until first READY# 
returned (for 64-bit read cycles) or until second 
READY# returned (for 128-bit fld.q read cycles) 

Freeze Condition: ld/st/pfld/fld/fst and data cache load miss 
processing in progress 
Delay: One plus number of clocks until last READY# 
returned 

Freeze Condition: Reference to dest of ld, call, calli, fxfr, or ld.c in 
the next instruction. (Dest of call and calli is r1.) 
Delay: One clock 

Freeze Condition: Reference to dest of fld/pfld/ixfr in the next two 
instructions 
Delay: Two clocks in the first instruction; one in the 
second instruction 

Freeze Condition: bc/bnc/bc.t/bnc.t following addu/adds/subu/ 
subs/pfeq/pfle/pfgt 
Delay: One clock 

Freeze Condition: Fsrc1 of multiplier operation refers to result of 
previous operation 
Delay: One clock 

Freeze Condition: Floating-point operation or graphics-unit 
instruction or fst, and scalar operation in progress 
other than frcp or frsqr 
Delay: If the scalar operation is fadd, fix, fmlow, fmul.ss, 
fmul.sd, ftrunc, or fsub, two minus the number of 
instructions (or dual-mode pairs) already executed 
after the scalar operation. If the scalar operation is 
fmul.dd, three minus the number of instructions 
(or dual-mode pairs) executed after it. Add one if 
either or both of these two situations occur: 

1. There is an overlap between the result register 
of the previous scalar operation and the source 
of the floating-point operation, and the 
destination precision of the scalar operation is 
different than the source precision of the 
floating-point operation. 

2. The floating-point operation is pipelined and its 
destination is not f0. 

There is no delay if the result is negative. 

Freeze Condition: Multiplier operation preceded by a double 
precision multiply 
Delay: One clock 

Freeze Condition: TLB miss 
Delay: Five plus the number of clocks to finish two reads 
plus the number of clocks to set A-bits (if 
necessary) 

Freeze Condition: pfld when three pfld's are outstanding 
Delay: One plus the number of clocks to return data from 
first pfld 

Freeze Condition: pfld hits in the data cache 
Delay: Two plus the number of clocks to finish all 
outstanding accesses 

Freeze Condition: st, pst or fst miss, ld miss, or flush with modified 
block when store path full (two stores or one 256-bit 
write-back internally waiting for bus plus 
external bus pipeline full) 
Delay: One plus the number of clocks until READY# 
active on next 64-bit write cycle or second 
READY# of next 128-bit write cycle. 

Freeze Condition: ld, fld, pfld, st, pst, or fst when address path full 
(one address internally waiting for bus plus 
external bus pipeline full) 
Delay: Number of clocks until next nonrepeated address 
can be issued (i.e., an address that is not the 2nd-4th 
cycle of a cache fill, the 2nd-8th cycle of a 
CS8 mode instruction fetch, nor the 2nd cycle of a 
128-bit write) 

Freeze Condition: ld/fld following st/fst hit 
Delay: One clock 

Freeze Condition: Delayed branch not taken 
Delay: One clock 

Freeze Condition: Nondelayed branch taken: 
bc, bnc 
bte, btne 
Delay: One clock (bc, bnc) 
Two clocks (bte, btne) 

Freeze Condition: Indirect branch bri or call/calli 
Delay: One clock 

Freeze Condition: st.c 
Delay: Two clocks 

Freeze Condition: Result of graphics-unit instruction (other than 
fmov.dd) used in next instruction when the next 
instruction is an adder- or multiplier-unit instruction 
Delay: One clock 

Freeze Condition: Result of graphics-unit instruction used in next 
instruction when the next instruction is a 
graphics-unit instruction 
Delay: One clock 

Freeze Condition: flush followed by flush 
Delay: Three clocks minus the number of instructions 
between the two flush instructions. There is no 
delay if the result is negative. 

Freeze Condition: fst or pst followed by pipelined floating-point 
operation that overwrites the register being stored 
Delay: One clock 


8.4 Instruction Characteristics 

The following table lists some of the characteristics 
of each instruction. The characteristics are: 

• What processing unit executes the instruction. 
The codes for processing units are: 

A Floating-point adder unit 
E Core execution unit 
G Graphics unit 
M Floating-point multiplier unit 

• Whether the instruction is pipelined or not. A P 
indicates that the instruction is pipelined. 

• Whether the instruction is a delayed branch 
instruction. A D marks the delayed branches. 

• Whether the instruction changes the condition 
code CC. A CC marks those instructions that 
change CC. 

• Which faults can be caused by the instruction. 
The codes used for exceptions are: 

IT Instruction Fault 
SE Floating-Point Source Exception 
RE Floating-Point Result Exception, including 
overflow, underflow, inexact result 
DAT Data Access Fault 

Note that this is not the same as specifying at 
which instructions faults may be reported. A 
result exception is reported on the subsequent 
floating-point instruction, pst, fst, or sometimes 
fld, pfld, and ixfr. 

The instruction access fault IAT and the interrupt 
trap IN are not shown in the table because they 
can occur for any instruction. 

• Performance notes. These comments regarding 
optimum performance are recommendations 
only. If these recommendations are not followed, 
the i860 XR microprocessor automatically waits 
the necessary number of clocks to satisfy internal 
hardware requirements. The following notes 
define the numeric codes that appear in the 
instruction table: 

1. The following instruction should not be a 
conditional branch (bc, bnc, bc.t, or bnc.t). 

2. The destination should not be a source 
operand of the next two instructions. 

3. A load should not directly follow a store that is 
expected to hit in the data cache. 

4. When the prior instruction is scalar, fsrc1 
should not be the same as the fdest of the 
prior operation. 

5. The fdest should not reference the destination 
of the next instruction if that instruction is a 
pipelined floating-point operation. 

6. The destination should not be a source 
operand of the next instruction. (For call and calli, 
the destination is r1.) 
7. When the prior operation is scalar and 
multiplier op1 is fsrc1, fsrc2 should not be the same 
as the fdest of the prior operation. 

8. When the prior operation is scalar, fsrc1 and 
fsrc2 of the current operation should not be the 
same as fdest of the prior operation. 

9. A pfld should not immediately follow a pfld. 

• Programming restrictions. These indicate 
combinations of conditions that must be avoided by 
programmers, assemblers, and compilers. The 
following notes define the alphabetic codes that 
appear in the instruction table: 

a. The sequential instruction following a delayed 
control-transfer instruction may not be another 
control-transfer instruction (except in the case 
of external interrupts), nor a trap instruction, 
nor the target of a control-transfer instruction. 

b. When using a bri to return from a trap handler, 
programmers should take care to prevent traps 
from occurring on that or on the next 
sequential instruction. IM should be zero (interrupts 
disabled) when the bri is executed. 

c. If fdest is not zero, fsrc1 must not be the same 
as fdest. 

d. When fsrc1 goes to the multiplier op1, KR, or 
KI, fsrc1 must not be the same as fdest. 

e. If fdest is not zero, fsrc1 and fsrc2 must not be 
the same as fdest. 

f. isrc1 must not be the same as isrc2 for the 
autoincrementing form of this instruction. 

g. isrc1 must not be the same as isrc2. 

Core and Floating-Point Instruction Interaction in 
Dual-Instruction Mode 

1. If one of the branch-on-condition instructions 
bc or bnc is paired with a floating-point 
compare, the branch tests the value of the 
condition code prior to the compare. 

2. If an ixfr, fld, or pfld loads the same register 
as a source operand in the floating-point 
instruction, the floating-point instruction 
references the register value before the load 
updates it. 

3. An fst or pst that stores a register that is the 
destination register of the companion 
pipelined floating-point operation will store the 
result of the companion operation. 

4. When the core instruction sets CC and the 
floating-point instruction is pfgt, pfle, or pfeq, 
CC is set according to the result of pfgt, pfle, 
or pfeq. 

5. When a trap instruction causes a trap in 
dual-instruction mode, the floating-point instruction 
has neither completed execution nor has 
updated the FT bit or any result status bits. This 
is not a problem when the trap is inserted by a 
debugger, because the trap is replaced by the 
original instruction, and the dual-mode pair is 
reexecuted. However, when the trap is 
programmed, the trap handler must avoid 
reexecuting the trap by returning to user code at 
the address in fir + 8. In this case, the trap 
handler must emulate the floating-point 
instruction before returning to the user code. 
Emulation of the instruction must include all 
side-effects (for example, the effect of its 
D-bit, effect on the pipelines, and effect on FT 
and result-status bits), just as if the instruction 
had been executed by the processor in the 
original context. 

6. In dual-instruction mode, when the intovr 
instruction causes a trap, the floating-point 
companion instruction has completely finished 
execution before the trap is taken. 
Programming Restrictions for Dual-Instruction 
Mode 

1. The result of placing a core instruction in the 
low-order 32 bits or a floating-point instruction 
in the high-order 32 bits is not defined (except 
for shrd r0, r0, r0 which is interpreted as 
fnop). 

2. A floating-point instruction that has the D-bit 
set must be aligned on a 64-bit boundary (i.e., 
the three least-significant bits of its address 
must be zero). This applies as well to the initial 
32-bit floating-point instruction that triggers 
the transition into dual-instruction mode, but 
does not apply to the following instruction. 

3. When the floating-point operation is scalar 
and the core operation is fst or pst, the store 
should not reference the result register of the 
floating-point operation. When the core 
operation is pst, the floating-point instruction 
cannot be (p)fzchks or (p)fzchkl. 

4. When the core instruction of a dual-mode pair 
is a control-transfer operation and the 
previous instruction had the D-bit set, the 
floating-point instruction must also have the D-bit set. 
In other words, an exit from dual-instruction 
mode cannot be initiated (first instruction pair 
without D-bit set) when the core instruction is 
a control-transfer instruction. 

5. When the core operation is a ld.c or st.c, the 
floating-point operation must be d.fnop. 

6. When the floating-point operation is fxfr, the 
core instruction cannot be ld, ld.c, st, st.c, 
call, ixfr, or any instruction that updates an 
integer register (including autoincrement 
indexing). Furthermore, the core instruction cannot 
be a fld, fst, pst, or pfld that uses as isrc1 or 
isrc2 the same register as the idest of the 
fxfr. Additionally, in dual-instruction mode, 
fxfr may not be used in a branch delay slot if 
its destination register is referenced by the 
preceding branch instruction. 

7. A bri must not be executed in dual-instruction 
mode if any trap bits are set. 

8. When the core operation is bc.t or bnc.t, the 
floating-point operation cannot be pfeq or 
pfgt. The floating-point operation in the 
sequentially following instruction pair cannot be 
pfeq or pfgt, either. 

9. A transition to or from dual-instruction mode 
cannot be initiated on the instruction following 
a bri. 

10. An ixfr, fld, or pfld cannot update the 
destination of the companion floating-point 
instruction (unless the destination is f0 or f1) 
or of the following pipelined floating-point 
instruction (regardless of its destination 
register). No overlap of register destinations is 
permitted; for example, the following 
instructions must not be paired: 

// Illegal case 1 

d.fmul.ss f9, f10, f5 
fld.d address, f4       ; Overlaps f5 

// Illegal case 2 

d.fmul.ss f0, f0, f3 
fld.q address, f0       ; Overlaps f3 

// Illegal case 3 

d.fmul.ss f9, f10, f11 
fld.l address, f5 
d.pfadd.ss fx, fx, f4   ; Overlaps f5, if last stage 
                        ; result is double-precision 

11. During a locked sequence, a transition to or 
from dual-instruction mode is not permitted. 
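
The first part of restriction 10 (a load must not update the destination of its companion floating-point instruction) can be expressed as a register-overlap test, as an assembler might do. This is a simplified sketch with hypothetical names: it covers only fld loads paired with a companion whose result occupies a single register, and it does not model the rule about the following pipelined instruction (illegal case 3).

```python
# Number of consecutive F-P registers written by each load form.
LOAD_WIDTH = {"fld.l": 1, "fld.d": 2, "fld.q": 4}

def loaded_registers(load_op: str, first_reg: int) -> set[int]:
    """Registers written by a load whose destination starts at first_reg."""
    return set(range(first_reg, first_reg + LOAD_WIDTH[load_op]))

def pair_is_illegal(load_op: str, load_dest: int, fp_dest: int) -> bool:
    """True if the load overwrites the companion's destination register."""
    if fp_dest in (0, 1):              # f0 and f1 are exempt from the rule
        return False
    return fp_dest in loaded_registers(load_op, load_dest)
```

With these definitions, illegal case 1 (fld.d into f4 paired with a result in f5) and illegal case 2 (fld.q into f0 paired with a result in f3) are both reported as illegal.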
Table 8.9. Instruction Characteristics 

              Execution  Pipelined?  Sets            Performance  Programming 
Instruction   Unit       Delayed?    CC?    Faults   Notes        Restrictions 

adds          E                      CC              1 
addu          E                      CC              1 
and           E                      CC 
andh          E                      CC 
andnot        E                      CC 
andnoth       E                      CC 
bc            E 
bc.t          E          D                                        a 
bla           E          D                                        a,g 
bnc           E 
bnc.t         E          D                                        a 
br            E          D                                        a 
bri           E          D                                        a,b 
bte           E 
btne          E 
call          E          D                           6            a 
calli         E          D                           6            a 
fadd.p        A                             SE,RE 
faddp         G                                      8 
faddz         G                                      8 
famov.r       A                             SE,RE 
fiadd.z       G                                      8 
fisub.z       G                                      8 
fix.p         A                             SE,RE 
fld.y         E                             DAT      2,3          f 
flush         E 
fmlow.p       M                                      4 
fmul.p        M                             SE,RE    4 
form          G                                      8 
frcp.p        M                             SE,RE 
frsqr.p       M                             SE,RE 
fst.y         E                             DAT      5            f 
fsub.p        A                             SE,RE 
ftrunc.p      A                             SE,RE 
fxfr          G                                      6,8 
fzchkl        G                                      8 
fzchks        G                                      8 
intovr        E                             IT 
ixfr          E                                      2 
ld.c          E 
ld.x          E                             DAT      6 
or            E                      CC 
orh           E                      CC 
pfadd.p       A          P                  SE,RE 
pfaddp        G          P                           8            e 
pfaddz        G          P                           8            e 
pfam.p        A&M        P                  SE,RE    7            d 
pfamov.r      A          P                  SE,RE 
pfeq.p        A          P           CC     SE       1 
pfgt.p        A          P           CC     SE       1 
pfiadd.z      G          P                           8            e 
pfisub.z      G          P                           8            e 
pfix.p        A          P                  SE,RE 
pfld.z        E          P                  DAT      2,9          f 
pfmam.p       A&M        P                  SE,RE    7            d 
pfmsm.p       A&M        P                  SE,RE    7            d 
pfmul.p       M          P                  SE,RE    4            c 
pfmul3.dd     M          P                  SE,RE    4            c 
pform         G          P                           8            e 
pfsm.p        A&M        P                  SE,RE    7            d 
pfsub.p       A          P                  SE,RE 
pftrunc.p     A          P                  SE,RE 
pfzchkl       G          P                           8 
pfzchks       G          P                           8 
pst.d         E                             DAT                   f 
shl           E 
shr           E 
shra          E 
shrd          E 
st.c          E 
st.x          E                             DAT 
subs          E                      CC              1 
subu          E                      CC              1 
trap          E                             IT 
xor           E                      CC 
xorh          E                      CC 



DATA SHEET REVISION REVIEW 

The following list represents the key differences be- 
tween version 002 and version 001 of the i860 XR 
Microprocessor Data Sheet. 

1. Big-endian description in section 2.3 has been 
expanded. 

2. Bit 17 of the Extended Processor Status Regis- 
ter (EPSR) is the INT bit which reflects the value 
on the interrupt pin (INT), as described in sec- 
tion 2.2.4 entitled "EXTENDED PROCESSOR 
STATUS REGISTER". This is a documentation 
update only. 

3. The cacheability of a page is controlled by 
NOR'ing the value of the CD, WT bits and the 



KEN# input pin, as described in section 2.5 en- 
titled "Caching and Cache Flushing" and sec- 
tion 3.1.14 entitled "Cache Enable (KEN #)". 
This is a documentation update only. 

4. The NOTE section in section 2.5 entitled "Cach- 
ing and Cache Flushing" has been updated to 
clarify the paging requirement on changing the 
DTB field in the dirbase register. 

5. Information on register encoding is added in 
section 8.2 entitled "Instruction Format and En- 
coding". This is a documentation update only. 

The following list represents the key differences be- 
tween version 003 and version 002 of the i860 XR 
Microprocessor Data Sheet. 
Specification Changes: 

1. Specification changes for improved AC 
performance are in section 7.3. 

2. HOLD is acknowledged during locked bus 
cycles. See section 3.1.8. 

3. Additional paths have been added to the bus 
state diagram to allow direct transitions from 
states T12 and T11 to state TH. See Figures 4.1 
and 4.10. 

4. Two new instructions, (p)famov.r, have been 
added. These replace (p)fadd.ds and 
(p)fadd.sd in the assembler pseudo-ops 
(p)fmov.r. These changes are in section 8.1 
and tables 2.7, 8.7, and 8.9. 

Documentation Changes: 

1. Big and little endian description has been ex- 
panded in sections 2.2.2, 2.3, and Figure 2.8. 

2. The actions and explanations of the lock, un- 
lock, and st.c dirbase changing the BL bit have 
been updated in sections 2.2.4, 3.1.5, 3.1.8, 
4.3.4,4.3.5, and 8.1. 

3. The explanation of the AA and MA bits of the 
fpsr have been expanded in section 2.2.8. 

4. The explanation of the WT bit of the Page Table 
Entries has been expanded in sections 2.4.4.4 
and 2.5. 

5. A change concerning the locking of the bus dur- 
ing address translation is explained in sections 
2.4.5 and 2.8.5. 

6. A further explanation on when to flush the data 
cache is given in section 2.5. 

7. The explanation of the floating point multiplier 
pipeline has been expanded in section 2.6.1. 

8. The explanation of BREQ has been expanded 
in section 3.1.4 and Figure 4.1. 

9. The explanation of result exceptions has been 
expanded in sections 2.8 and 3.2. 

10. Instruction fetch identification has been clarified 
in section 3.1.6 and table 3.2. 

11. Bus cycle diagrams in Figures 4.7, 4.8, and 4.10 
have been clarified/corrected. 

12. Precision specification .r has been added to 
section 8.0 and table 8.1. 

13. In section 8.4, performance note 9 has been 
added, programming restriction d has been 
changed, and programming restriction f has 
been added. Table 8.9 has been updated to re- 
flect these changes. 

14. The description of testability has changed in 
sections 3.3. and 3.3.2. RESET and HOLD must 
be asserted by the tester to force the chip out- 
puts to float (tri-state). 



The following list represents the major differences 
between version 004 and version 003 of the i860 XR 
Microprocessor Data Sheet: 

Section 2.2.4 The explanation of the WP bit of the 
epsr has been expanded. 

Section 2.8.2 More information on the instruction 
trap has been added. 

Section 2.8.4 The instruction access trap has been 
clarified. 

Section 2.8.7 The values of registers after a reset 
trap have been specified. 

Section 3.1.4 BREQ timing has been clarified. 

Section 3.1.5 The calculation of interrupt latency 
has been corrected. 

Section 3.1.6 The description of the byte-enable 
signals has been expanded. 

Section 3.1.8 The relation between the lock 
instruction and the LOCK# signal has 
been clarified. The BL bit should no 
longer be changed by writing to the 
dirbase register. 

Section 6.0 The thermal specifications have been 
updated. 

Section 7.3 The A.C. Characteristics for CLK have 
changed. 

Section 7.3 Advance timing information for the 50 
MHz clock rate has been added. 
These timings are subject to change 
without notice. 

Section 8.0 The operand naming conventions
have been improved.

Section 8.2.1 The encoding of the flush instruction 
has been corrected. 

Section 8.3 The data-dependent multiplier freeze 
has been eliminated. Other freeze 
conditions have been corrected or 
clarified. 

The following list represents the major differences 
between version 005 and version 004 of the i860 XR 
Microprocessor Data Sheet. 

Section 2.2.4 OF bit is writable only in supervisor 
mode using ST.C. 

Section 3.1.1 CLK rate has been updated. 

Section 5.0 Figure 5.3 has been corrected. 

Section 6.0 More information on measuring case 
temperature has been added. 

Section 6.0 Figure 6.1 has been updated to include
25 MHz.

Section 6.0 Table 6.1 has been corrected. 

Section 6.0 Table 6.2 has been updated to include
25 MHz.
Section 7.2 The D.C. Characteristics have been
updated to include 25 MHz power supply
current.

Section 7.3 The A.C. Characteristics for CLK have
been changed.

Section 7.3 50 MHz clock rate has been deleted.

Section 7.3 25 MHz A.C. Specifications have been
added.

Section 7.3 Figure 7.1 has been corrected.

Section 8.3 The data-dependent multiplier rounding
freeze has been eliminated.

Section 8.4 Programming restrictions for dual-instruction
mode are added.






Intel 



PRELIMINARY



82495XP CACHE CONTROLLER/ 
82490XP CACHE 






Two-Way, Set Associative, Secondary
Cache for i860™ XP Microprocessor

50 MHz "No Glue" Interface with CPU 

Configurable 

— Cache Size 256 or 512 Kbytes 

— Line Width 32, 64 or 128 Bytes 

— Memory Bus Width 64 or 128 Bits 

Dual-Ported Structure Permits 
Simultaneous Operations on CPU and 
Memory Buses 

Efficient MRU Way Prediction 

— Zero Wait States on MRU Hit 

— One Wait State on MRU Miss 

Dynamically Selectable Update Policies 

— Write-Through 

— Write-Once 

— Write-Back 



MESI Cache Consistency Protocol

Hardware Cache Snooping

Maintains Consistency with Primary
Cache via Inclusion Principle

Flexible User-Implemented Memory
Interface Enables Wide Range of
Product Differentiation

— Clocked or Strobed 

— Synchronous or Asynchronous 

— Pipelining 

— Memory Bus Protocol 

82495XP Cache Controller Available in
208-Lead Ceramic Pin Grid Array
Package

82490XP Cache RAM Available in
84-Lead Plastic Quad Flatpack Package

(See Packaging Handbook, Order #240800) 




The Intel 82495XP cache controller and 82490XP cache RAM, when coupled with a user-implemented memory
bus controller, provide a second-level cache subsystem that eliminates the memory latency and bandwidth
bottleneck for a wide range of multiprocessor systems based on the i860 XP microprocessor. The CPU
interface is optimized to serve the i860 XP microprocessor with zero wait states at up to 50 MHz. A secondary
cache built from the 82495XP and 82490XP isolates the CPU from the memory subsystem; the memory can
run slower and follow a different protocol than the i860 XP microprocessor.
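The configurable cache size, line width and memory bus width listed in the feature summary are related by simple arithmetic. The following Python sketch is an illustration only: the function names are hypothetical, and it does not model which combinations the CFG0-2 configuration pins actually permit.

```python
def lines_in_cache(cache_kbytes: int, line_bytes: int) -> int:
    """Number of cache lines held by a cache of the given size."""
    return (cache_kbytes * 1024) // line_bytes


def transfers_per_line(line_bytes: int, bus_bits: int) -> int:
    """Memory bus transfers needed to move one full cache line."""
    bus_bytes = bus_bits // 8
    if line_bytes % bus_bytes != 0:
        raise ValueError("line width must be a multiple of the bus width")
    return line_bytes // bus_bytes
```

For instance, a 64-byte line moves in 8 transfers over a 64-bit memory bus and in 4 transfers over a 128-bit bus.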
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Figure 0-1. Secondary Cache Configuration 
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1.0 82495XP/82490XP PINOUTS 



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Figure 1-1. 82495XP Pinout (Bottom View) 
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Figure 1-2. 82495XP Pinout (Top View) 
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Figure 1-3. 82490XP Pinout (Top View) 
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Figure 1-4. 82490XP Pinout (Bottom View) 
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1.1 Pin Cross Reference Tables 

Table 1-1. 82495XP Pin Cross Reference by Name 



Signal           Location  Signal           Location  Signal           Location

ADS#             B15       AHOLD            A17       BGT#             M03
BLAST#           C15       BLE#             C16       BOFF#[CLEN0]     G15
BRDY#            P01       BRDYC1#          D15       BRDYC2#          F14
BUS#             P16       CACHE#           G14       CADS#            E03
CAHOLD           G04       CDC#             D03       CDTS#            F04
CFA0             E15       CFA1             B14       CFA2             D06
CFA3             B02       CFA4             A16       CFA5             E14
CFA6             D14       CLK              D11       CMIO#            D04
CNA#[CFG0]       L04       CRDY#[SLFTST#]   M02       CWAY             J03
CWR#             E04       DC#              H14       DRCTM#           M01
EADS#            J15       FLUSH#[NCPFLD#]  N04       FPFLD#[FPFLDEN]  J04
FSIOUT#          D01       HITM#[CPUTYP]    D17       INV[CLEN1]       K15
KEN#             D16       KLOCK#           C03       KWEND#[CFG2]     M04
LEN              F15       LOCK#            B16       MALE[WWOR#]      Q02
MAOE#            S04       MAWEA#           Q17       MBALE[HIGHZ#]    P04
MBAOE#           P06       MCACHE#          C02       MCFA0            Q16
MCFA1            N14       MCFA2            R04       MCFA3            Q06
MCFA4            P15       MCFA5            P14       MCFA6            P13
MCYC#            P17       MHITM#           H04       MIO#             F16
MKEN#            R01       MRO#             J01       MSET0            Q15
MSET1            P12       MSET10           Q11       MSET2            P11
MSET3            Q14       MSET4            R16       MSET5            Q13
MSET6            R17       MSET7            S17       MSET8            P10
MSET9            Q12       MTAG0            Q10       MTAG1            P09
MTAG10           Q07       MTAG11           P07       MTAG2            Q09
MTAG3            R14       MTAG4            Q08       MTAG5            R15
MTAG6            S14       MTAG7            S15       MTAG7            S17
MTAG8            P08       MTAG9            S16       MTHIT#           G03
MWBWT#           K03       NA#              J17       NENE#            D05
PALLC#           D02       PCD              H15       PCYC             J14
PWT              C17       RDYSRC           C01       RESET            Q05
SET0             D13       SET1             C13       SET10            A09
SET2             C14       SET3             B12       SET4             C12
SET5             C11       SET6             D12       SET7             D09
SET8             D10       SET9             B09       SMLN#            C06
SNPADS#          F03       SNPBSY#          F01       SNPCLK[SNPMD]    S03
SNPCYC#          H03       SNPINV           P05       SNPNCA           Q03
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Table 1-1. 82495XP Pin Cross Reference by Name (Continued) 



Signal           Location  Signal           Location  Signal           Location

SNPSTB#          R03       SWEND#[CFG1]     Q01       SYNC#[MEMLDRV]   Q04
TAG0             C08       TAG1             A04       TAG10            B01
TAG11            C05       TAG2             D08       TAG3             A03
TAG4             B04       TAG5             B03       TAG6             C07
TAG7             A02       TAG8             D07       TAG9             A01
TCK              P03       TDI              N03       TDO              C04
TMS              P02       WAY              L15       WBA              M14
WBTYP            N15       WBWE#            M15       WBWT#[WRMRST]    K14
WR#              B17       WRARR#           L14

NC               A14, A15, S01, S02

Vcc              A05-A08, A10-A13, E01, E17, H01, H17, K01, K17, L01, L17,
                 C09, N17, F17, G01, G17, M17, N01, S05-S13

Vss              B05-B08, B10-B11, B13, E02, E16, F02, H02, H16, J02, J16,
                 K02, K04, K16, L02-L03, L16, C10, N16, G02, G16, R02,
                 R05-R10, M16, N02, R11-R13



Table 1-2. 82490XP Pin Cross Reference by Name 



Signal            Location  Signal            Location  Signal            Location

A0                65        A1                66        A10               77
A11               78        A12               79        A13               80
A14               81        A15               82        A2                67
A3                68        A4                69        A5                70
A6                71        A7                73        A8                75
A9                76        ADS#              63        BE#               64
BLAST#            59        BOFF#             36        BRDY#             60
BRDYC#            61        BUS#              40        CDATA0            48
CDATA1            54        CDATA2            49        CDATA3            55
CDATA4            46        CDATA5            51        CDATA6            52
CDATA7            57        CLK               30        CRDY#             43
HITM#             62        MAWEA#            41        MBRDY#[MISTB]     22
MCLK[MSTBM#]      26        MCYC#             42        MDATA0            18
MDATA1            14        MDATA2            10        MDATA3            6
MDATA4            16        MDATA5            12        MDATA6            8
MDATA7            4         MDOE#             20        MEOC#             23
MFRZ#[MEMLDRV]    24        MOCLK[MOSTB]      27        MSEL#[MTR4#/...]  25
MZBT#[MX4#/...]   21        PAR#              32        RESET             28
TCK               3         TDI               2         TDO               84
TMS               1         WAY               45        WBA               38
WBTYP             37        WBWE#             39        WR#               58
WRARR#            44

NC                83

Vcc               5, 9, 13, 17, 29, 35, 50, 56, 74

Vss               7, 11, 15, 19, 31, 33, 34, 47, 53, 72
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1.2 Quick Pin Reference 



BGT#[C490LDRV] 


I 


Bus Guaranteed Transfer, [82490XP Low Drive] 

This signal is generated by the MBC and supplied to the 82495XP. It indicates
to the 82495XP a commitment by the MBC to complete the cycle on the memory
bus. Until BGT# activation the 82495XP owns the cycle and will abort it if
an intervening snoop occurs. After BGT# the cycle is owned by the MBC until
its completion. From BGT# until SWEND#, snoops will be accepted, but none
will be processed until SWEND# activation.

During RESET's falling edge, this signal controls the drive strength of the
82495XP to 82490XP interface signals. This strength is a function of the
cache size, and therefore of the number of 82490XPs. Refer to the layout
specifications section for more details.


BLE# 





BE Latch Enable 

The BLE# signal is used to control the enable line of an external '377-type 
latch. The latch captures the i860 XP CPU's BE (Byte Enable) signals and 
other CPU provided cycle attributes which do not go through the 82495XP. 


BRDY# 


I 


82495XP Burst Ready 

This is the burst ready indication from the memory bus controller. The MBC 
should connect its burst ready indication to the CPU BRDY#, the 82495XP 
BRDY# and the 82490XP BRDY#. In the CPU, it provides the same function 
as that described in the CPU data sheet. The 82495XP will only use this 
indication for burst tracking purposes. In the 82490XP, it increments the CPU 
latch burst counter. 


CADS# 





Cache Address Strobe 

This signal is generated by the 82495XP and used by the memory bus 
controller. Its assertion requests execution of a memory bus cycle by the 
memory bus controller. This signal when active indicates that the cache cycle 
control and attribute signals are valid. 


CAHOLD 





82495XP AHOLD 

This signal is generated by the 82495XP to track the CPU AHOLD signal 
when used for warm-reset and LOCKed sequences. It also provides 
information about CPU and cache BIST. 


CD/C# 





Cache Data/Control 

This is a cycle definition signal driven by the 82495XP. It indicates the type of 
memory bus cycle requested. This signal is valid with CADS# and can be 
pipelined by the memory bus controller. 


CDTS# 





Cache Data Strobe 

This signal is driven by the 82495XP to the memory bus controller. For
read cycles, CDTS# indicates that in the next CLK the memory bus controller can
generate the first BRDY# for the read cycle. For write cycles it indicates
when data is available on the memory bus. Use of this signal allows
complete independence between the address strobes (CADS#, SNPADS#) and
the data strobe.


CFG0-2


I


Cache Configuration bits 0-2

These signals are inputs to the 82495XP. CFG0-2 allow the 82495XP to be
configured in one of 5 different modes. The mode determines the 82495XP/CPU
line ratio, the tag size (4K/8K), and the lines per sector.



CLK 


I 


Clock 

This signal provides the fundamental timing for the 82495XP, 82490XP and 
CPU. It must be provided to the 82495XP, 82490XPs, CPU and memory bus 
controller components with minimal skew. 


CM/IO#





Cache Memory/IO

This signal is driven by the 82495XP and is a cycle definition signal. It 
indicates the type of memory bus cycle requested. This signal is valid with 
CADS# and can be pipelined by the memory bus controller. 


CNA#[CFG0] 


I 


82495XP Next Address Enable, [Configuration Pin 0] 

This signal is driven by the memory bus controller and supplied to the
82495XP. It is used by the memory bus controller to dynamically pipeline
CADS# cycles.

During RESET's falling edge it functions as the 82495XP CFG0 input.


CRDY#[SLFTST#] 


I 


Cache Memory Bus Ready, [82495XP Self Test] 

This signal is generated by the memory bus controller and informs the
82495XP and 82490XP that a memory bus cycle has been completed.
CRDY# activation ends the memory bus cycle.

During RESET's falling edge, if this signal is sampled low (active) and
MBALE is sampled high (active), the 82495XP self test will be invoked.


CWAY 





Cache Way 

CWAY is driven by the 82495XP and is a cycle definition signal that
indicates to the memory bus controller the WAY to be used by the
requested cycle. On line-fills it indicates the way into which the line will be
loaded. For write-backs it indicates the WAY that was written back. This signal
is valid with CADS#.


CW/R# 





Cache Write/Read 

This signal is driven by the 82495XP and is an 82495XP cycle definition
signal. It indicates the type of memory bus cycle requested. This signal is
valid with CADS# and can be pipelined by the memory bus controller.


DRCTM# 


I 


Memory Bus Direct to [M] State 

This signal is an input to the 82495XP. It is the mechanism by which the 
memory bus can dynamically inform the 82495XP of a request to skip the 
[E] state and move the line directly to the [M] state. This signal is sampled 
by the 82495XP when SWEND# is asserted. 


FLUSH#[NCPFLD#]


I


Flush the 82495XP cache, [Enable Non-Cacheable PFLD]

This signal is an input to the 82495XP. When active, FLUSH# causes the
82495XP to write back all of its modified lines to main memory and then
invalidate all tag locations. At the end of a flush operation the 82495XP tag
array is completely invalidated.

During RESET activation, this pin functions as the NCPFLD# configuration
signal which, with FPFLDEN, selects one of three modes for handling
i860 XP CPU floating point load cycles.



FPFLD#[FPFLDEN] 


I/O 


FIFO PFLD Enable [PFLD Mode Select] 

During RESET, the FPFLDEN and NCPFLD# inputs select one of three
modes for handling i860 XP CPU pipelined floating point load cycles. In the
mode which supports an external FIFO, the FPFLD# output indicates a
PFLD cycle to be loaded into the FIFO.


FSIOUT#





Flush/Sync/Initialization Output 

This signal is an output of the 82495XP and indicates the start and end of 
three operations: Flush, Sync, and Initialization. The output is activated 
when the operation internally begins and is de-activated when the 
operation ends. 


KLOCK# 





82495XP LOCK#

This signal is driven by the 82495XP and indicates to the memory bus 
controller a request to execute atomic read-modify-write sequences. 
KLOCK# is active with the CADS# of the first LOCKed operation and 
remains active until at least the clock following CADS# of the last cycle of 
LOCKed operation. 


KWEND#[CFG2] 


I 


Cacheability Window End, [Configuration Pin 2] 

This signal is generated by the MBC and indicates to the 82495XP that the
Cacheability Window has expired. At this point the 82495XP will latch the
memory cacheability signal (MKEN#) and make decisions based on the
cacheability attribute. MRO#, which indicates the Read-Only cycle attribute,
is also sampled at this point.

During RESET's falling edge this line functions as the CFG2 configuration
signal, which is used to configure the 82495XP/82490XP with cache
parameters.


MALE[WWOR#]


I


Memory Bus, Address Latch Enable [Weak Write Ordering]
This signal is generated by the memory bus controller and controls an
82495XP internal transparent address latch ('373-like). CADS# will
generate a new address at the input of the internal address latch. MALE
activation (high) allows this address to flow to the memory bus,
provided MAOE# is active. When MALE is inactive (low), the address at the
latch input is latched.

WWOR# configures the 82495XP into strong or weak write-ordering
mode.
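The interaction of MALE and MAOE# described above can be pictured with a small behavioral model. This is only a sketch in Python under simplifying assumptions (no timing, one call per evaluation); the class and argument names are hypothetical and are not part of the 82495XP documentation.

```python
class AddressLatch:
    """Behavioral model of a transparent ('373-like) address latch."""

    def __init__(self):
        self.held = None  # address currently captured by the latch

    def update(self, addr_in, male, maoe_n):
        """Evaluate the latch for one set of input levels.

        male   -- 1 = transparent (address flows through), 0 = hold
        maoe_n -- 0 = output buffer enabled (active low), 1 = tristated
        Returns the address driven on the memory bus, or None if tristated.
        """
        if male:
            self.held = addr_in  # transparent: follow the input
        return self.held if maoe_n == 0 else None
```

With MALE high the output follows the input; when MALE goes low the last address is held, and MAOE# high floats the output regardless of the latch contents.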


MAOE# 


I 


Memory Bus Address Output Enable 

This signal is generated by the memory bus controller and controls the
82495XP's output buffer of the memory bus address latches. The 82495XP
drives the memory bus address lines if MAOE# is active (low). Otherwise,
they are tristated. MAOE# also serves as a qualifier for snooping cycles: when
it is inactive, snoops are enabled.


MBALE[HIGHZ#] 


I 


Memory Bus, 82495XP sub-line-address Latch Enable [High Impedance Output]

This signal has the same function as MALE but controls only the 82495XP
sub-line addresses. It is generated by the memory bus controller
and controls an 82495XP internal transparent address latch ('373-like).
CADS# will generate a new address at the input of the internal address
latch. MBALE activation (high) allows the sub-line address to flow
to the memory bus, provided MBAOE# is active. When MBALE is inactive (low),
the sub-line address at the latch input is latched.

HIGHZ#, if active along with SLFTST#, causes the 82495XP to float all of
its outputs.


MBAOE# 


I 


Memory Bus, 82495XP sub-line Address Output Enable 

This signal has a function similar to that of MAOE#, but controls only the
82495XP sub-line addresses.

If MBAOE# is active (low), the 82495XP will drive the sub-line portion of the
address onto the memory bus. Otherwise, it is tristated. MBAOE# is also
sampled during snoop cycles. If MBAOE# is sampled inactive with
SNPSTB#, the snoop write-back cycle (if any) will begin at the sub-line
address provided. If MBAOE# is active with SNPSTB#, the snoop write-back
will begin at sub-line address 0.
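The rule for the starting sub-line address of a snoop write-back can be written as a one-line selection. The Python sketch below only restates that rule; the function name and the integer encoding of the pin level are assumptions made for illustration.

```python
def snoop_writeback_start(mbaoe_n: int, snooped_subline: int) -> int:
    """Sub-line address at which a snoop write-back begins.

    mbaoe_n is the level of MBAOE# sampled with SNPSTB#:
    1 = inactive (start at the snooped sub-line address),
    0 = active   (start at sub-line address 0).
    """
    return snooped_subline if mbaoe_n == 1 else 0
```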


MBRDY#(MISTB) 


I 


Memory Bus Ready, (Memory Input Strobe) 

This pin is an input to the 82490XP. It is used in clocked bus mode to 

indicate the end of a transfer. When active(low) it indicates that the 

82490XP should increment the burst counter and either output the next 

data or get ready to accept the next data. 

In strobed memory bus mode this pin is the input data strobe to the 

82490XP. On each MISTB edge, the 82490XP latches the data and 

increments the burst counter. 


MCACHE# 





82495XP Internal Cacheability 

This signal is driven by the 82495XP. On read cycles, this signal indicates 
the cycle's internal cacheability attribute. In write cycles MCACHE# is only 
active for write-back cycles. MCACHE# is not activated for I/O, special 
cycles and Locked Cycles. 


MCFA6-MCFA0 
MSET10-MSET0 
MTAG11-MTAG0 


I/O 
I/O 
I/O 


Memory Bus Configurable address lines 

Memory bus SET number 

Memory bus TAG bits 

These are the memory bus address lines of the 82495XP and should be
connected to the A31-A2 (A31-A3 for a 64-bit bus) signals of the Memory
Bus. These signals, along with the byte enables, define the physical area of
memory or I/O accessed.

The 82495XP drives these signals in normal memory bus cycles and uses
them as inputs during snooping.
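The three groups together provide 12 + 11 + 7 = 30 lines, matching A31-A2. The Python sketch below shows how a 32-bit byte address could be assembled from them. The bit ordering used here (MTAG most significant, then MSET, then MCFA) is an assumption made purely for illustration; the actual mapping of the configurable MCFA lines depends on the cache configuration and is not stated in this section.

```python
def compose_address(mtag: int, mset: int, mcfa: int) -> int:
    """Build a byte address from MTAG11-0, MSET10-0 and MCFA6-0.

    ASSUMED ordering: MTAG -> A31-A20, MSET -> A19-A9, MCFA -> A8-A2.
    """
    if not 0 <= mtag < (1 << 12):
        raise ValueError("MTAG is 12 bits wide")
    if not 0 <= mset < (1 << 11):
        raise ValueError("MSET is 11 bits wide")
    if not 0 <= mcfa < (1 << 7):
        raise ValueError("MCFA is 7 bits wide")
    a31_a2 = (mtag << 18) | (mset << 7) | mcfa  # 30-bit value for A31-A2
    return a31_a2 << 2                          # A1-A0 come from byte enables
```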


MCLK[MSTBM#] 


I 


Memory Bus Clock, [Memory Input Strobe] 

In clocked memory bus mode this pin provides the memory bus clock to the
82490XP. In clocked mode, memory bus signals and memory bus data are
sampled on the rising edge of MCLK. In a clocked memory bus write,
data is driven off MCLK or MOCLK depending upon the configuration.

This pin is an input to the 82490XP. It is sampled during reset and
determines the memory bus type. If active (low), the memory bus will be
strobed. If inactive (high), the memory bus will be clocked.
If a clock is detected at this input, this pin becomes the memory bus clock,
and clocked memory bus mode is selected.


MDATA0-MDATA7 


I/O 


Memory Bus Data 

These pins are the 8 memory data pins of the 82490XP. All or part of these 
pins will be used depending on the cache configuration. In clocked memory 
bus mode, these pins are sampled with the rising edge of MCLK. New data 
is driven out on these pins with MEOC# or the rising edge of MCLK or 
MOCLK together with MBRDY# active. In strobed memory bus mode, 
these pins are sampled on each MISTB edge. New data is driven out on 
these pins with each MOSTB edge. 



MDOE# 


I 


Memory Data Output Enable 

This signal is an input to the 82490XP. The memory bus output enable is 
used to control the 82490XP's driving of data onto the memory bus. When 
this pin is inactive(high), the MDATA[0:7] pins are tristated. When this pin is 
active(low), the MDATA[0:7] pins are actively driving data. The function of 
this pin is the same for strobed or clocked memory bus operation as 
MDOE# has no relation to CLK or MCLK. 


MEOC# 


I 


Memory End of Cycle 

This signal is an input to the 82490XP. Since it is synchronous to the 

memory bus, it may be used to end a cycle on the memory bus and begin a 

pending cycle without waiting for synchronization to the CPU CLK. MEOC# 

also causes the latching or driving of data and resetting of the memory burst 

counter. 


MFRZ#[MEMLDRV] 


I 


Memory Freeze, [Memory Bus Low Drive] 

This signal is an input to the 82490XP. It is used for write cycles that could 

cause allocation cycles. When this pin is active(low), write data is latched in 

the 82490XP. The subsequent allocation will not overwrite data latched by 

the write. This prevents the actual write to memory from having to be 

performed on the memory bus. The allocated line will be placed in the [M] 

state in the cache since memory has not been updated. 

During RESET's falling edge, this signal is sampled to indicate the 

82490XP's memory bus driving strength. The 82490XP provides normal and 

high drive capability buffers. 


MHITM# 





Memory Bus Hit to Modified Line 

This signal is driven by the 82495XP during snoop cycles and indicates
whether the snooping address hit a Modified line in the 82495XP cache. The
82495XP automatically schedules the writing-back of modified lines when
snoop hits occur. MHITM# is activated the CLK after SNPCYC# and will
remain active until the next SNPSTB#.


MKEN# 


I 


Memory Bus Cacheability 

This signal is an input to the 82495XP. It is the memory bus cache enable 

pin. It is used to indicate to the 82495XP if the current memory bus cycle is 

cacheable or not. This pin is sampled by the 82495XP with KWEND# 

assertion. 


MOCLK(MOSTB) 


I 


Memory Output Clock, (Memory Output Strobe) 

MOCLK controls a transparent latch at the 82490XP data outputs. By 

providing a clock input, skewed from MCLK, MDATA hold time may be 

increased. 

In strobed bus mode this pin is the data output strobe. On each MOSTB 

edge, new data will be output onto the memory bus. 


MRO# 


I 


Memory Bus Read-Only 

This pin is an input to the 82495XP. It is the READ-ONLY attribute pin. It is 
used to indicate to the 82495XP that the accessed line should get a READ- 
ONLY attribute. READ-ONLY lines will be non-cacheable in the first level 
cache. READ-ONLY lines will be cached in the 82495XP if MKEN# is 
sampled active during KWEND# and will be cached in the [S] state. This pin 
is sampled by the 82495XP with KWEND# assertion. 
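The way MKEN# and MRO#, both sampled at KWEND#, combine into line attributes can be summarized in a few lines. This Python sketch restates only what the MKEN# and MRO# entries say; the function name and the returned dictionary keys are hypothetical, and the treatment of a read-only indication on a non-cacheable cycle is not specified here.

```python
def attributes_at_kwend(mken_n: int, mro_n: int) -> dict:
    """Decode MKEN# and MRO# as sampled with KWEND# (both active low)."""
    cacheable = (mken_n == 0)
    read_only = (mro_n == 0)
    return {
        # The 82495XP caches the line only if MKEN# was sampled active.
        "cached_in_82495xp": cacheable,
        # READ-ONLY lines are non-cacheable in the first-level cache.
        "blocked_from_cpu_cache": read_only,
        # READ-ONLY lines that are cached are held in the [S] state.
        "forced_state": "S" if (cacheable and read_only) else None,
    }
```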



MSEL#[MTR4/TR8#] 


I 


Memory Select, [Memory Transfer] 

This signal is a chip select input to the 82490XP. MSEL# activation 

qualifies the MBRDY# input of the 82490XP. MSEL# going active causes 

the sampling of MZBT# for the next cycle. MSEL# going inactive resets 

the 82490XP's internal memory burst counter. 

This pin is used to determine the number of transfers necessary on the 

memory bus for each cache line. If high, there are 4 transfers on the 

memory bus for each cache line. If low, there are 8 transfers on the 

memory bus for each cache line. 


MTHIT# 





Memory Bus Tag Hit 

This signal is driven by the 82495XP during snoop cycles. It indicates 
whether the snooping address hit any line (exclusive, shared, or modified) 
in the 82495XP cache. MTHIT# is activated the CLK after SNPCYC# and 
will remain active until the next SNPSTB#. 
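The snoop result outputs follow directly from the MESI state of the snooped line. The Python sketch below maps a state to the levels of MTHIT# and MHITM# as described in their entries; the function name and the single-letter state encoding are assumptions for illustration, and timing is not modelled.

```python
def snoop_outputs(state: str) -> tuple:
    """Return (MTHIT#, MHITM#) levels for a snooped line; 0 means active."""
    if state not in ("M", "E", "S", "I"):
        raise ValueError("state must be one of M, E, S, I")
    hit = state in ("M", "E", "S")   # any valid line is a tag hit
    hit_modified = (state == "M")    # only a modified line asserts MHITM#
    return (0 if hit else 1, 0 if hit_modified else 1)
```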


MWB/WT# 


I 


Memory Bus Write Policy 

This signal is an input to the 82495XP. It is the mechanism by which the 
memory bus can dynamically inform the 82495XP of the cycle write policy 
(Write-Through/Write-Back). This signal is sampled by the 82495XP with 
SWEND# activation. 


MZBT#[MX4/MX8#] 


I 


Memory Zero Based Transfer, [Memory I/O Bits] 

This signal is an input to the 82490XP. When this pin is sampled active 

(with MSEL# or MEOC#) it indicates that the memory bus cycle should 

start with burst location zero independent of the sub-line address 

requested by the CPU. 

This pin is used to determine the number of IO pins used for the memory 

bus. When HIGH it indicates that 4 IO pins are used per 82490XP. When 

LOW it indicates that 8 IO pins are used. 


NENE# 





Next Near 

This signal is generated by the 82495XP and indicates to the memory bus
controller if the address of the requested memory cycle is "near" the
address of the previously generated one (in the same 2K DRAM page).
This information can be used by the memory bus controller to optimize
access to paged or static column DRAMs. This signal is valid together with
CADS#.
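The "near" test amounts to comparing the page portion of two consecutive addresses. The Python sketch below assumes a 2-Kbyte page aligned on a 2-Kbyte boundary and byte addresses; the names are hypothetical and the electrical polarity of NENE# is not modelled.

```python
PAGE_BYTES = 2048  # assumed: the 2K DRAM page is 2 Kbytes, naturally aligned


def is_near(previous_addr, current_addr: int) -> bool:
    """True if current_addr lies in the same 2K page as previous_addr."""
    if previous_addr is None:  # no earlier cycle to compare against
        return False
    return previous_addr // PAGE_BYTES == current_addr // PAGE_BYTES
```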


PALLC# 





Potential Allocate 

This signal is generated by the 82495XP and indicates to the memory bus 
controller that the current write cycle can potentially allocate a cache line. 
Potential allocate cycles are cycles which are 82495XP misses with PCD, 
PWT inactive. 


RDYSRC 





Ready Source 

This signal is an output of the 82495XP. It indicates the source of the 
BRDY generation for the CPU. When high it indicates that the memory bus 
controller should generate BRDYs to the CPU, when low it indicates that 
the 82495XP will be the one providing BRDYs. 



RESET 


I 


Reset 

This signal forces the 82495XP and 82490XP to begin execution at a known state. Its
falling edge samples the state of the configuration pins. RESET is an asynchronous
input to the 82495XP and 82490XP.

The following 82495XP pins are sampled during RESET's falling edge:

CNA#[CFG0]: CFG0 line of 82495XP configuration inputs.
SWEND#[CFG1]: CFG1 line of 82495XP configuration inputs.
KWEND#[CFG2]: CFG2 line of 82495XP configuration inputs.
FLUSH#[NCPFLD#]: Enables decoding of the non-cacheable PFLD mode. Active if low.
FPFLD#[FPFLDEN]: Enables the external FIFO PFLD mode. Active high.
BGT#[C490LDRV]: Indicates the driving strength of the 82495XP/82490XP interface. If
high, the 82495XP can drive up to 1 82490XP's without derating. If low, the 82495XP
can drive up to 18 82490XP's without derating.
SYNC#[MEMLDRV]: Indicates the 82495XP's memory bus driving strength.
SNPCLK[SNPMD]: Indicates the snoop mode, synchronous or asynchronous.
CFG0-CFG2 signals are used to configure the 82495XP/82490XP with cache
parameters. They define the lines/sector, line ratio, and number of tags.
MALE[WWOR#]: Enforces strong or weak write-ordering consistency.
MBALE[HIGHZ#]: If active along with SLFTST#, tristates all 82495XP outputs.

The following 82490XP pins are sampled during RESET's falling edge:

PAR#: If active (low), this pin configures the 82490XP as a parity storage device. The
parity configuration stores the parity bits belonging to data stored in other 82490XP's.
MZBT#[MX4/MX8#]: Determines the number of IO pins used for the memory bus
interface. If high, four IO pins are chosen. If low, eight IO pins are chosen.
MSEL#[MT4/MT8#]: Determines the number of transfers necessary on the memory bus
for each cache line. If high, four memory bus transfers are needed to fill a cache line. If
low, eight memory bus transfers are needed to fill a cache line.
MCLK[MSTBM#]: If active (low), this pin indicates a strobed memory bus configuration. If
inactive (high), a clocked memory bus is chosen.
MFRZ#[MEMLDRV]: Indicates the 82490XP's memory bus driving strength.
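The 82490XP options sampled at RESET's falling edge can be decoded mechanically from the pin levels. The following Python sketch covers only the four pins whose high/low meaning is stated in the list above; the function name and dictionary keys are hypothetical, and MFRZ#[MEMLDRV] is left out because its polarity is not given in this entry.

```python
def decode_82490xp_straps(par_n: int, mzbt_n: int, msel_n: int, mclk: int) -> dict:
    """Decode 82490XP configuration pins sampled at RESET's falling edge.

    Each argument is the sampled pin level (0 = low, 1 = high).
    """
    return {
        "parity_device": par_n == 0,                    # PAR# low: parity storage
        "memory_io_pins": 4 if mzbt_n == 1 else 8,      # MX4/MX8#
        "transfers_per_line": 4 if msel_n == 1 else 8,  # MT4/MT8#
        "memory_bus": "strobed" if mclk == 0 else "clocked",
    }
```

For example, sampling all four pins high selects a data (non-parity) device with four memory I/O pins, four transfers per line and a clocked memory bus.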


SMLN# 





Same Cache Line 

This signal is an output of the 82495XP. It is used to indicate to the memory bus controller
that the current cycle is to the same 82495XP line as the previous one. This indication
can be used by the memory bus controller to selectively activate its SNPSTB# signal to
other caches. For example, back to back snoop hits to the same line may be snooped
only once. This signal is valid together with CADS#.



SNPADS# 





Cache Snoop Address Strobe 

This signal is an output of the 82495XP. Its function is identical to that of
CADS#, but it is generated only on snoop write-back cycles. Because
snoop write-back cycles are the only ones generated independently of
CPU bus activity, this separate address strobe should ease implementation of
the memory bus controller. Whenever SNPADS# is active, the memory bus
controller should abort all pending cycles, that is, cycles for which BGT# has
not yet been issued. (After BGT# the memory bus controller is responsible for
completing the cycle.) The 82495XP assumes that non-committed cycles are
aborted upon SNPADS# and may re-issue them after the completion of the snoop.


SNPBSY# 





Snoop Busy 

This signal is driven by the 82495XP. When inactive (high), it indicates that the
82495XP is ready to accept another snoop cycle. SNPBSY# will be activated
for one of two reasons: a snoop hit to a modified line, or a back-invalidation
that is needed while one is already in progress. In either of these cases, the
82495XP will not perform the look-up for a pending snoop until SNPBSY# is
de-activated.


SNPCLK[SNPMD] 


I 


Snoop Clock [Snoop Mode] 

This pin provides the 82495XP with the snoop clock to be used in clocked
memory interfaces. In clocked mode, SNPSTB#, SNPINV, SNPNCA,
MBAOE#, MAOE#, and the address lines are sampled by SNPCLK.
During RESET activation, this pin functions as the SNPMD (snoop mode)
signal. If high it indicates strobed snooping mode. If low it indicates
synchronous snooping mode. For clocked snooping mode, SNPCLK is
connected to the snoop clock source.


SNPCYC# 





Snoop Cycle 

This signal is an output of the 82495XP. It indicates when the snooping look-up 

is actually taking place in the 82495XP tag RAM. 


SNPINV 


I 


Snoop Invalidation 

This signal is an input to the 82495XP and indicates the resulting line state in 
case of a snoop hit cycle. If active, it forces the line to go to an invalid state. 
This signal is sampled with SNPSTB#. 


SNPNCA 


I 


Snoop Non Caching Device Access 

This signal is an input to the 82495XP and provides the 82495XP information 
on whether the current memory bus master is a non caching device (DMA, 
etc). This indication allows the 82495XP to avoid changing line states from 
exclusive to shared unnecessarily. 


SNPSTB# 


I 


Snoop Strobe 

This signal is an input to the 82495XP which is used to initiate a snoop.
SNPSTB# causes the latching of the snoop address and parameters. The
82495XP supports three latching modes: Clocked, Strobed, Synchronous. In
the clocked mode, address and attribute signals will be latched with the
activation of SNPSTB#.SNPCLK. In the strobed mode, address and attributes
will be latched by the SNPSTB# falling edge. In synchronous mode, address
and attribute signals will be latched with the activation of SNPSTB#.CLK.



SWEND#[CFG1] 


I 


Snoop Window End, [Configuration Pin 1] 

This signal is generated by the MBC and indicates to the 82495XP that the 
Snoop Window has expired. At this point the 82495XP will latch the memory 
bus attributes: write policy (MWB/WT#), and direct to [M] transfer 
(DRCTM#). At the end of the snooping window, all other devices have 
snooped the bus master's address and have generated address caching 
attributes on the bus. Once a cycle begins, the 82495XP prevents snooping 
until it has received SWEND#. The 82495XP will act based on those 
attributes and will update its tag RAM. 

During RESET's falling edge this line functions as the CFG1 configuration 
signal which is used to configure the 82495XP/82490XP with cache 
parameters. 


SYNC#[MEMLDRV] 


I 


Synchronize 82495XP cache, [Memory Bus Low Drive] 

This signal is an input to the 82495XP. Activation of this line will cause the 
synchronization of the 82495XP tag array with main memory. All 82495XP 
modified lines will be written back to main memory. The difference between 
FLUSH and SYNC is that on SYNC the 82495XP and CPU tag array will NOT 
be invalidated. All the valid entries will be kept, with all modified lines 
(M state) becoming non-modified (E state). 

During RESET's falling edge, this signal is sampled to indicate the memory 
bus driving strength. If it is sampled low, the maximum capacitive load 
without derating is 100 pF. If it is sampled high, the maximum capacitive load 
without derating is 50 pF. 
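
The difference between FLUSH and SYNC described above can be illustrated with a small model. This is only a sketch: the dictionary of line states, the function names, and the addresses are hypothetical, and the model ignores the CPU's own cache and all bus timing.

```python
from enum import Enum


class State(Enum):
    M = "Modified"
    E = "Exclusive"
    S = "Shared"
    I = "Invalid"


def flush(lines):
    """Model of FLUSH: write back Modified lines, then invalidate every line.

    `lines` maps a (hypothetical) line address to its State.
    Returns (new_states, written_back_addresses).
    """
    written_back = [addr for addr, st in lines.items() if st is State.M]
    return {addr: State.I for addr in lines}, written_back


def sync(lines):
    """Model of SYNC: write back Modified lines, keep all valid entries.

    Modified lines become Exclusive; every other state is unchanged.
    """
    written_back = [addr for addr, st in lines.items() if st is State.M]
    new_states = {
        addr: (State.E if st is State.M else st) for addr, st in lines.items()
    }
    return new_states, written_back
```

In both cases the same set of Modified lines is written to main memory; only the resulting tag states differ.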


TCK 


I 


Testability Clock 

This signal is an input to both the 82495XP and 82490XP. This is the 
boundary scan clock. This signal has to be connected to a clock 
synchronous to CLK to insure initialization of the test logic. 


TDI 


I 


Testability serial input 

This signal is an input to both the 82495XP and 82490XP. 


TDO 


O 


Testability serial output 

This signal is an output of both the 82495XP and 82490XP. 


TMS 


I 


Testability Control 

This signal is an input to both the 82495XP and 82490XP. 




The following pins have internal pull-ups: 

ADS#, NA#, FPFLD#, TDI, TMS, BGT#, 
KWEND#, SWEND#, CNA#, BRDY#, SYNC#, 
FLUSH#, SNPSTB#, MRO#, DRCTM#, TCK, 
SNPCLK, MFRZ#, MZBT#, MCLK, MOCLK. 



During tri-state output testing sequence, all pull-ups 
will be disabled. 

The following signals are glitch free. These signals 
are always at a valid logic level following RESET: 

CADS#, CDTS#, SNPADS#, SNPCYC#. 
1.3 Output Pins 

Table 1-3 lists all output pins, from which part(s) they are driven, and their active levels. 

Table 1-3. Output Pins 



Name                   Part               Active Level 

BLE#                   82495XP            LOW 
CADS#                  82495XP            LOW 
CAHOLD                 82495XP            HIGH 
CDTS#                  82495XP            LOW 
CWAY                   82495XP            - 
CW/R#, CD/C#, CM/IO#   82495XP            - 
FSIOUT#                82495XP            LOW 
KLOCK#                 82495XP            LOW 
MCACHE#                82495XP            LOW 
MHITM#                 82495XP            LOW 
MTHIT#                 82495XP            LOW 
NENE#                  82495XP            LOW 
PALLC#                 82495XP            LOW 
RDYSRC                 82495XP            HIGH 
SMLN#                  82495XP            LOW 
SNPADS#                82495XP            LOW 
SNPBSY#                82495XP            LOW 
SNPCYC#                82495XP            LOW 
TDO                    82495XP/82490XP    - 









1.4 Input Pins 

Table 1-4 lists all input pins, which part(s) they are input to, their active level, and whether they are 
synchronous or asynchronous inputs. 





Table 1-4. Input Pins 




Name               Part               Active Level   Synchronous/Asynchronous 

BGT#[C490LDRV]     82495XP            LOW            Synchronous to CLK 
BRDY#              82495XP/82490XP    LOW            Synchronous to CLK 
CLK                82495XP/82490XP    -              - 
CFG3               82495XP            -              Synchronous to CLK 
CNA#[CFG0]         82495XP            LOW            Synchronous to CLK 
CRDY#[SLFTST#]     82495XP/82490XP    LOW            Synchronous to CLK 
DRCTM#             82495XP            LOW            Note 2 
FLUSH#[NCPFLD#]    82495XP            LOW            Asynchronous 
CPUTYP             82495XP            LOW            Synchronous to CLK 
KWEND#[CFG2]       82495XP            LOW            Synchronous to CLK 
MALE, MBALE        82495XP            HIGH           Asynchronous 
MAOE#, MBAOE#      82495XP            LOW            Asynchronous 
MCLK[MSTBM#]       82490XP            LOW            Synchronous to MCLK 
MBRDY# (MISTB)     82490XP                           - 
MDOE#              82490XP            LOW            Asynchronous 
MEOC#              82490XP            LOW            Synchronous/Asynchronous, Note 1 
MFRZ#              82490XP            LOW            Synchronous/Asynchronous, Note 1 
MOCLK (MOSTB)      82490XP 
MSEL[MTR4/TR8#]    82490XP            LOW            Synchronous/Asynchronous, Note 1 
MZBT#[MX4/MX8#]    82490XP            LOW            Synchronous/Asynchronous, Note 1 
MKEN#              82495XP            LOW            Note 2 
MRO#               82495XP            LOW            Note 2 
MWB/WT#            82495XP            -              Note 2 
PAR#               82490XP            LOW            Synchronous to CLK 
RESET              82495XP/82490XP    HIGH           Asynchronous 
SNPCLK[SNPMD]      82495XP            -              - 
SNPINV             82495XP            HIGH           Note 3 
SNPNCA             82495XP            HIGH           Note 3 
SNPSTB#            82495XP            LOW            Note 3 
SWEND#[CFG1]       82495XP            LOW            Synchronous to CLK 
SYNC#[MEMLDRV]     82495XP            LOW            Asynchronous 
TCK                82495XP/82490XP    - 
TDI                82495XP/82490XP    -              Synchronous to TCK 
TMS                82495XP/82490XP    -              Synchronous to TCK 




NOTES: 

(1) In Clocked memory bus mode these pins are synchronous with MCLK. In Strobed memory bus mode these pins are 
asynchronous. 

(2) MWB/WT#, DRCTM# must be synchronous to CLK during SWEND#. MKEN#, MRO# must be synchronous to CLK 
during KWEND#. 

(3) In clocked memory bus mode these pins are synchronous with SNPCLK. In strobed memory mode these pins are 
asynchronous. 

1.5 Input/Output Pins 

Table 1-5 lists all input/output pins, which part they interface with, and when they are floated. 

Table 1-5. Input/Output Pins 



Name               Part       Synch/Asynch         When Floated 

FPFLD#[FPFLDEN]    82495XP    Synchronous to CLK   - 
MCFA0-MCFA6        82495XP    Note 1               MAOE# = High 
MDATA0-MDATA7      82490XP    Note 2               MDOE# = High and during Reset 
MSET0-MSET10       82495XP    Note 1               MAOE# = High 
MTAG0-MTAG11       82495XP    Note 1               MAOE# = High 



NOTES: 

(1) With MALE high and MAOE# low, these pins are synchronous to CLK. 

(2) In Clocked memory bus mode these pins are synchronous with MCLK. In Strobed memory bus mode these pins are 
asynchronous. 
1.6 Pin State During Reset 



Table 1-6. Pin State During Reset 




Pin Name                                   Pin State during Reset 

CADS#, CDTS#, SNPADS#                      High 
CW/R#, CD/C#, CM/IO#, MCACHE#              Undefined 
RDYSRC, PALLC#, CWAY                       Undefined 
NENE#, SMLN#                               Undefined 
KLOCK#                                     High 
FPFLD#                                     High 
MSET0-MSET10, MTAG0-MTAG11, MCFA0-MCFA6    Note 1 
CAHOLD                                     Note 2 
MHITM#, MTHIT#                             High 
SNPCYC#, SNPBSY#                           High 
TDO                                        Note 3 



NOTES: 

(1) MSET, MTAG, and MCFA signals are high impedance during reset if MAOE# and MBAOE# are deasserted. 

(2) The state of CAHOLD depends on whether self-test is selected (see testability chapter for details). 

(3) The State of TDO is controlled by the boundary scan which is independent of other signals including RESET (see 
testability chapter for details). 



2.0 CHIPSET INTRODUCTION 

The 82495XP/82490XP is a second-level cache 
controller chipset for the i860 XP CPU. The chipset 
provides a unified code and data cache which is 
software transparent. The 82495XP/82490XP has 
been designed to support a high-speed CPU/cache 
core interface, and a same or lower speed memory 
bus interface. 

The 82495XP is the cache controller. It contains 8K tags and control 
logic to control up to a 512K size cache. The 82490XP is a custom cache 
data RAM designed to be used with the 82495XP. Between 8 and 18 82490XPs 
are required to create a 256K to 512K cache, respectively. The memory 
bus controller (MBC) is the set of logic required to interface the 
82495XP and 82490XP to the memory bus. The MBC provides product 
differentiation, and its implementation ultimately determines system 
performance. 



2.1 Main Features 

The 82495XP/82490XP have the following main 
features: 

— Tracks the speed of the i860 XP CPU 

— Large cache size support: 
   4K or 8K tags 
   1 or 2 lines per sector 
   4 or 8 transactions per line 
   64 or 128-bit wide memory bus 
   256K or 512K cache 

— Write-back cache with full multiprocessing consistency support: 
   supports the MESI protocol 
   watches memory bus to guarantee 1st level, 2nd level cache consistency 
   maintains inclusion 

— Two-way set-associative with MRU hit prediction algorithm 

— Zero wait state hit cycles on MRU hit. One wait state on MRU misses 

— Concurrent CPU and memory bus transactions 

— Supports synchronous, asynchronous, and strobed memory bus architectures 
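
The wait-state behaviour of the MRU hit prediction listed above can be sketched as follows. This is an illustrative model only: it assumes that an "MRU miss" means a hit in the way that was not predicted, and the `CacheSet` class and `lookup` function are hypothetical names, not part of the chipset.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class CacheSet:
    """One set of a two-way set-associative cache (illustrative only)."""
    tags: List[Optional[int]] = field(default_factory=lambda: [None, None])
    mru: int = 0  # way predicted for the next access


def lookup(s: CacheSet, tag: int) -> Tuple[bool, Optional[int]]:
    """Return (hit, wait_states); wait_states is None on a cache miss."""
    if s.tags[s.mru] == tag:
        return True, 0          # prediction correct: zero wait states
    other = 1 - s.mru
    if s.tags[other] == tag:
        s.mru = other           # the other way becomes most recently used
        return True, 1          # hit, but prediction wrong: one wait state
    return False, None          # neither way matches: cache miss
```

Repeated accesses to the same way therefore run with zero wait states, and only a switch between ways of the same set costs the extra wait state.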



2.2 CPU/Cache Core Description 

Figure 2-1 depicts a block diagram of the basic 
cache subsystem. The cache subsystem provides a 
gateway between the CPU and the memory bus. All 
CPU accesses which can be serviced locally by the 
cache subsystem will be filtered out from the memo- 
ry bus traffic. Therefore local cycles (CPU cycles 
which hit the cache and do not require a memory 
bus cycle) will be completely invisible to the memory 
bus providing the reduction in memory bus band- 
width necessary for multiprocessing systems. Anoth- 
er very important function of the 82495XP cache 
subsystem is to provide speed decoupling between 
the CPU and memory busses. Processors are quick- 
ly achieving operating frequencies which can be 
very difficult for the memory subsystem to meet. The 
82495XP cache subsystem is optimized to serve the 
CPU with zero wait-states up to very high frequencies (50 MHz), at the 
same time providing the decoupling necessary to run slower memory bus cycles. 

Figure 2-1. 82495XP Cache Subsystem 

The basic functions of the cache subsystem elements are: 

82495XP: Main control element, includes the tags and line states and 
provides hit or miss decisions. It handles the CPU bus requests completely 
and coordinates with the memory bus controller when an access needs the 
memory bus. It controls the 82490XP data paths for both hits and misses to 
provide the CPU with the correct data. It dynamically adds wait states 
based on the MRU prediction mechanism. The 82495XP is also responsible for 
performing memory bus snoop operations while other devices are using the 
memory bus. The 82495XP drives the cycle address and other attributes 
during a memory bus access. A block diagram of the 82495XP is shown in 
Figure 2-2. 

Figure 2-2. 82495XP Block Diagram 
82490XP: Implements the cache SRAM storage and data path. It includes 
latches, muxes, and logic which allow it to work in lock-step with the 
82495XP to efficiently serve both hit and miss accesses. It takes full 
advantage of internal silicon flexibility to provide a degree of 
performance otherwise unachievable with discrete implementations. It 
supports zero wait state hit accesses, concurrent CPU and memory bus 
accesses, and includes a replication of the MRU bits for autonomous way 
prediction. During memory bus cycles it acts as a gateway between CPU and 
memory buses. A block diagram of the 82490XP is shown in Figure 2-3. 



Memory Bus Controller: Server for memory bus cycles. It adapts the 
CPU/cache core to a specific memory bus protocol. It coordinates line 
fills, flushes, write-backs, etc. with the 82495XP. The memory bus 
controller's flexibility allows customers to easily adapt the 82495XP 
cache subsystem to their specific architectures, and to provide their own 
differentiation. Figure 2-4 shows an example memory bus controller. The 
MBC handles all cycle control, data transferring, snooping, and any 
synchronization. 

Figure 2-3. 82490XP Block Diagram 
3.0 CACHE OVERVIEW 

This chapter gives a brief description of 82495XP/ 
82490XP configurations, interface, snooping mecha- 
nism, cycle control mechanism, and memory bus 
control mechanism. Each section of this overview is 
described in more detail in later chapters. 



3.1 Configuration 

The 82495XP/82490XP cache chipset offers a num- 
ber of configuration options. The system designer 
can choose from a number of different operating 
characteristics, including memory bus modes, 
snooping modes, and internal physical attributes 
(line size, lines per sector, etc.). The flexibility of these 
configuration options allows the 82495XP/82490XP cache to be used in a 
wide range of applications. 

Configurations are selected by altering the 
82495XP/82490XP inputs during RESET. They are 
not dynamically changeable, and to conserve pins 
some configuration inputs become 82495XP or 
82490XP inputs/outputs after RESET. 



3.1.1 PHYSICAL CACHE 

Physically, the 82495XP/82490XP can be config- 
ured to support many different cache configurations. 
By selecting one cache configuration, other configu- 
rations may be excluded. The 82495XP/82490XP 
can be configured to support: 

— 256K or 512K cache 

— 64 or 128 bit wide memory bus 

— One or two lines per sector 



— 1:1, 1:2, or 1:4 CPU to 82495XP line size ratio 

— 4 or 8 memory bus transactions per line 

— 4K or 8K tag size 

— Strong or weak write ordering 

Figure 3-1 summarizes the basic configurations 
available when using the 82495XP/82490XP. 
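
The options listed above constrain one another, which can be checked with simple arithmetic. The sketch below assumes a 32-byte CPU line, one tag per sector, and that bus width times transactions per line must equal the 82495XP line size; none of these relations is stated in this section, so treat them as assumptions. It enumerates combinations that are arithmetically consistent; Figure 3-1 remains the authority on which configurations the chipset actually supports.

```python
from itertools import product

CPU_LINE_BYTES = 32  # assumption: CPU line size, not given in this section


def configurations():
    """List option combinations that yield a 256K or 512K cache.

    Each row is (tags, lines_per_sector, line_bytes, cache_kbytes, buses),
    where buses is a list of (bus_width_bits, transactions_per_line).
    """
    rows = []
    for tags, lines_per_sector, ratio in product((4096, 8192), (1, 2), (1, 2, 4)):
        line_bytes = CPU_LINE_BYTES * ratio          # 1:1, 1:2 or 1:4 ratio
        cache_bytes = tags * lines_per_sector * line_bytes
        if cache_bytes not in (256 * 1024, 512 * 1024):
            continue
        # A line must be moved in exactly 4 or 8 bus transactions.
        buses = [
            (width, count)
            for width, count in product((64, 128), (4, 8))
            if (width // 8) * count == line_bytes
        ]
        if buses:
            rows.append(
                (tags, lines_per_sector, line_bytes, cache_bytes // 1024, buses)
            )
    return rows
```

For example, 8K tags with two 32-byte lines per sector gives 8192 x 2 x 32 = 512K bytes, filled in four transactions on a 64-bit bus.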

3.1.2 SNOOP MODES 

When another master snoops the 82495XP, the 
MBC must initiate the snoop request and pass on 
the response. The 82495XP allows the MBC to initi- 
ate this snoop request in one of three modes: syn- 
chronous, clocked, and strobed. The snoop re- 
sponse of the 82495XP is always synchronous. 

When initiating the snoop in synchronous snoop 
mode, all snoop information is latched by the 
82495XP synchronous to the CPU CLK. The snoop 
is then performed on the next CLK edge and the 
response given on the CLK edge after that. This is 
the fastest possible method of snooping. 

In clocked snooping mode, information is latched by 
the 82495XP with respect to an external snoop 
clock (slower than CLK) source. The 82495XP must 
internally synchronize this information to CLK and 
provide a response. 

In strobed snooping mode, information is latched 
into the 82495XP with respect to the falling edge of 
another signal. Thus, the snoop initiation is clock in- 
dependent. The 82495XP again synchronizes this in- 
formation with CLK. 
Figure 3-1. 82495XP/82490XP Configurations 

3.1.3 MEMORY BUS MODES 

The 82490XP may be configured to be in one of two memory bus modes. 
This mode determines how data will be passed on to and off of the data 
bus. The two modes are clocked mode and strobed mode. These modes need 
not have any relation to the snoop mode chosen. 

In clocked mode, data is driven from an external memory clock source 
called MCLK, or read with respect to MCLK. MCLK is completely independent 
of the CPU CLK source. There are inherent performance advantages, 
however, in making this clock source synchronous or half-clock (divided) 
synchronous to the CPU CLK. 

In strobed mode, data is driven from the rising edge 
of one signal, and read with the rising edge of anoth- 
er. Like the strobed snooping mode, this carries no 
clock skew problems, or memory bus speed limita- 
tions. 



3.2 CPU Bus Interface 

The CPU bus interface is the connection of the 
82495XP and 82490XP to the i860 XP CPU. Be- 
cause this interface is optimized to achieve the high 
speed performance, it is not a flexible interface. The 
majority of the signals in the CPU bus interface must 
be connected strictly between the 82495XP/ 
82490XP cache and the i860 XP CPU. Chapter 10 
addresses the use of such signals. 

Some CPU signals are, however, accessible by the MBC. These are the 
following pins: RESET, CLK, BRDY2#, INT, BERR, PCHK#, PEN#, TCK, TDI, 
TMS, TRST#, and TDO. CPU pins KB0, KB1, HIT#, and BREQ are also available 
to the MBC, but are of limited use in an 82495XP/82490XP system. 

Other CPU pins flow through a '377 type latch to the MBC. The latch 
enable is controlled by the 82495XP through the BLE# pin. The following 
CPU signals flow through this latch: PCD, PWT, BE0#-BE7#, CACHE#, LEN, 
PCYC, and CTYP. 



3.3 82495XP/82490XP Interface 

The 82495XP/82490XP interface is the connection 
between 82495XP and 82490XP. Like the CPU bus 
interface, this isolated interface is not flexible and 
may not be altered beyond what Intel has provided. 



3.4 Memory Bus and Memory Bus 
Controller Interface 

The memory bus controller (MBC) is the interface 
logic required to control the 82495XP/82490XP and 
connect it to the memory bus and rest of the system. 
The MBC may be simple enough to support a single- 
CPU write-through cache, or complex enough to 
support a multiprocessing cache with external tags. 
The 82495XP/82490XP is a very flexible chipset, 
and the MBC determines exactly how the 
82495XP/82490XP will work in a system. 

An MBC consists of a few basic blocks: a snoop 
logic block, a cycle control block (with synchronizers 
if necessary), and data path control block. The 
snoop block must be able to communicate with the 
other caches when snooping is necessary. At the 
same time, the cycle control block must interface to 
some arbitration logic for bus arbitration. 

3.4.1 SNOOPING LOGIC 

The MBC snooping logic is responsible for initiating 
a snoop in the 82495XP and providing the response 
to the rest of the system. Snoop logic must recog- 
nize what other caches are doing, and snoop if nec- 
essary. Snoop logic must also recognize when its 
82495XP is not capable of snooping and delay its 
snoop initiation. 

When a cycle begins on the bus, all other caches 
snoop. Once all the snoop results are returned to 
the master 82495XP, its snoop logic must recognize 
the result and alter the cycle appropriately. This 
could mean aborting the current cycle in memory, 
delaying the cycle until a write-back is performed, or 
changing the master's tag state according to the 
snoop information. 

3.4.2 CYCLE CONTROL LOGIC 

Cycle control logic is responsible for initiating a 
memory bus cycle, providing proper 82495XP cycle 
attributes during the cycle, and terminating the cy- 
cle. Cycle control logic determines the cacheability 
of the cycle, whether cycles are allocatable, pipelin- 
ing, and all aspects of the progress of the current 
cycle. 

Since cycle control logic interfaces memory bus sig- 
nals to the 82495XP, and since the memory bus is 
not necessarily synchronous to the 82495XP CLK, it 
may also provide proper synchronization. Careful 
design of this synchronization logic can minimize or 
eliminate synchronization penalties. 
3.4.3 DATA PATH CONTROL 

Data path control logic controls how data is written 
from the 82490XP or read into the 82490XP and 
CPU. It handles the actual transferring of data to/ 
from the memory data bus. Data path control logic 
also handles the CPU burst order, and the holding of 
data during allocation cycles. In systems with memo- 
ry busses that are wider than the CPU bus, the data 
path control logic appropriately steers data to the 
correct 82490XP's. 



3.5 Test 

The 82495XP/82490XP provide two means of 
cache testing. These are a built-in self-test, and 
boundary scan test. The built-in self-test (BIST) is 
initiated during RESET. The boundary scan test 
uses separate and dedicated pins on the 82495XP. 
These are described in a later chapter. 



4.0 CACHE CONSISTENCY PROTOCOL 

One of the 82495XP objectives is to implement a 
high performance second level cache for multipro- 
cessor systems. To fulfill this objective the 82495XP 
implements a "write-back" cache with full support 
for multiprocessing data consistency. Being a write- 
back cache means that the 82495XP may contain 
data which is not updated in the main memory. 
Therefore a mechanism is implemented to insure 
that data read by any system bus master, at any 
time, is correct. 

A key feature for multiprocessing systems is reduc- 
tion of the memory bus utilization. The memory bus 
quickly becomes a resource bottleneck with the ad- 
dition of multiple processors. The 82495XP cache 
consistency mechanism insures minimal usage of 
memory bus bandwidth. 

The 82495XP allows portions of memory to be de- 
fined as non-cacheable. For the cacheable areas, 
the 82495XP allows selected portions to be defined 
as write-through locations. 

The 82495XP protocol is implemented by assigning 
state bits for each cached line. Those states are de- 
pendent on both 82495XP data transfer activities 
performed as the bus master, and snooping activi- 
ties performed in response to snoop requests gener- 
ated by other memory bus masters. 



4.1 Cache Consistency Protocol 
Model 

The 82495XP consistency protocol is the set of rules 
which allows the 82495XP to contain data that is not 
updated in main memory while ensuring that memo- 
ry accesses by other devices do not receive stale 
data. This consistency is accomplished by assigning 
a special consistency state to every cached entry 
(line) in the 82495XP. 

NOTE: 

The following rules apply to memory read and write 
cycles. All I/O and special cycles bypass the 
cache. 

The 82495XP protocol consists of 4 states. They define whether a line 
is valid (hit or miss), if it is available in other caches (shared or 
exclusive), and if it has been modified (modified or unmodified). 

The 4 States are: 

[I] - INVALID 

Indicates that the line is not available in the cache. A read to this 
line will be a miss and cause the 82495XP to execute a line fill (fetch 
the whole line and deposit it into the cache SRAM). A write to this line 
will cause the 82495XP to execute a write-through cycle to the memory bus 
and in some circumstances initiate an ALLOCATION. 

[S] - SHARED 

This state indicates that this line is potentially shared with other 
caches (the same line may exist in more than one cache). A Shared line 
can be read out of the cache SRAM without a main memory access. Writing 
to a Shared line updates the 82495XP/82490XP cache, but also requires the 
82495XP to generate a write-through cycle to the memory bus. In addition 
to updating main memory, the write-through cycle will invalidate this 
line in other caches. Since writing to a Shared line causes a 
write-through cycle, the system can enforce a "write-through policy" to 
selected addresses by forcing those addresses into the [S] state. This 
can be done by setting the PWT attribute in the CPU page table or 
asserting the MWB/WT# pin each time the address is referenced. 
[E] - EXCLUSIVE 

This state indicates a line which is exclusively available in ONLY this 
cache, and that this line is NOT MODIFIED (main memory also has a valid 
copy). Writing to an Exclusive line causes it to change to the Modified 
state and can be done without informing other caches, so no memory bus 
activity is generated. 

[M] - MODIFIED 

This state indicates a line which is exclusively available in ONLY this 
cache, and is MODIFIED (main memory's copy is stale). A Modified line can 
be updated locally in the cache without acquiring the memory bus. Because 
a Modified line is the only up-to-date copy of data, it is the 82495XP's 
responsibility to flush this data to memory on accesses to it. Flushing 
of this data to memory will be executed immediately after completion of 
the current CPU bus cycle. 
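
The four state definitions above can be summarized as a few predicates. This is a sketch for orientation only; the `Mesi` enumeration and the function names are hypothetical, and the special cases covered in section 4.3 are ignored.

```python
from enum import Enum


class Mesi(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


def is_valid(state: Mesi) -> bool:
    """A CPU read hits only if the line is in a valid state."""
    return state is not Mesi.INVALID


def may_be_in_other_caches(state: Mesi) -> bool:
    """Only Shared lines are potentially present in another cache."""
    return state is Mesi.SHARED


def memory_is_stale(state: Mesi) -> bool:
    """Only a Modified line is newer than main memory's copy."""
    return state is Mesi.MODIFIED


def cpu_write_needs_memory_bus(state: Mesi) -> bool:
    """Writes to Shared or Invalid lines are written through to the bus;
    writes to Exclusive or Modified lines complete locally."""
    return state in (Mesi.SHARED, Mesi.INVALID)
```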



4.2 Basic State Transitions 

This section covers the most common, basic memo- 
ry accesses. The special functions which force a cy- 
cle to be noncacheable, locked, read only, or direct- 
to-Modified are not in use. These might be used, for 
example, in read for ownership and cache to cache 
transfers, and are covered in section 4.3. This basic 
transitions section is divided into two parts: the first 
covers MESI state changes which occur in a CPU/ 
cache core due to its own actions; the second de- 
scribes MESI state transitions in a CPU/cache core 
caused by the actions of other, external devices. 
Figure 4-1 shows a partial state diagram of the MESI 
coherency protocol which includes these basic tran- 
sitions. 

The 82495XP accepts line attributes from the CPU 
and memory buses. The 82495XP assumes that all 
caches on the memory bus have the SAME number 
of bytes per line. 



4.2.1 TRANSITIONS IN CACHE STATES 
CAUSED BY OWN CPU TRANSACTIONS 



The MESI state of each 82495XP/82490XP cache 
line changes as the 82495XP/82490XP services the 
read and write requests generated by its CPU. 



4.2.1.1 Read Hit 

A read hit occurs when the CPU generates a read 
cycle on its bus, and the data is present in and re- 
turned by the 82495XP/82490XP. The state of the 
cache line (M, E, or S) remains unchanged by a read 
operation which hits the cache. 

4.2.1.2 Read Miss 

A read miss arises when the CPU generates a read, and the data is not 
present in the 82495XP/82490XP cache: either the tag lookup does not 
produce a match or a match occurs but the data is Invalid. The 82495XP 
generates a memory access to fetch the data (which is assumed cacheable 
for this discussion) and the surrounding data needed to fill the cache 
line. This data is placed in the 82495XP/82490XP cache in an invalid line 
or (if both ways are valid) replaces the least recently used line, which 
is written back to memory if Modified. 

The new line is placed in the Exclusive state, unless 
either the CPU or memory indicates that it should be 
a write-through on its next write access using PWT 
or MWB/WT#, respectively. If either of these is as- 
serted, the new line is placed in Shared state. A new 
line could also be read in and placed directly into 
Modified state: see section 4.3.4 for details and use. 
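
The read miss rules above can be sketched as two small functions. The names `choose_way` and `fill_state` are hypothetical, states are written as the single letters "M", "E", "S", "I", and the direct-to-Modified case of section 4.3.4 is deliberately left out.

```python
def choose_way(states, lru_way):
    """Pick the way that receives the new line in a two-way set.

    states: two-element list of "M", "E", "S" or "I".
    lru_way: index of the least recently used way.
    Returns (way, needs_write_back).
    """
    for way, st in enumerate(states):
        if st == "I":
            return way, False           # an invalid way is used first
    # Both ways valid: replace the LRU way, writing it back if Modified.
    return lru_way, states[lru_way] == "M"


def fill_state(pwt, write_through_requested):
    """State of the newly filled line.

    write_through_requested stands for the memory bus indicating
    write-through on MWB/WT#.
    """
    return "S" if (pwt or write_through_requested) else "E"
```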



4.2.1.3 Write Hit 

When the CPU generates a write cycle, if the data is 
present in the 82495XP/82490XP cache, it is updat- 
ed and may undergo a MESI state change. 

If the hit line is originally in the Exclusive state, it 
changes to Modified state upon a write. If the hit line 
is originally in the Modified state, it remains in that 
state. Neither of these cases generates any bus ac- 
tivity. 

A write to a line which is in the Shared state causes 
the 82495XP to write the data out to memory as well 
as update the 82495XP/82490XP cache. The write 
to main memory also serves to invalidate any copy 
of the data which resides in another cache. The 
cache line state changes according to activity on the 
PWT and MWB/WT# pins. If neither of these pins is 
asserted, the write hit line becomes Exclusive. If ei- 
ther of these pins is asserted, the line is forced to 
remain write-through, so the state remains Shared. 
An existing line can also be written and forced directly into Modified 
state: see section 4.3.4 for details and use. 
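
The write hit cases described in this section can be condensed into one function. This is a minimal sketch with a hypothetical name and signature; it omits the direct-to-Modified case of section 4.3.4.

```python
def write_hit(state, pwt=False, write_through_requested=False):
    """Return (new_state, bus_write) for a CPU write that hits the cache.

    state is "M", "E" or "S"; write_through_requested stands for the
    memory bus indicating write-through on MWB/WT#.
    """
    if state in ("E", "M"):
        return "M", False               # updated locally, no bus activity
    if state == "S":
        # The write goes to memory, which also invalidates other copies.
        stays_shared = pwt or write_through_requested
        return ("S" if stays_shared else "E"), True
    raise ValueError("a write hit requires a valid line")
```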



4.2.1.4 Write Miss 

A write miss occurs when the CPU generates a write cycle and the data is 
not present in the 82495XP/82490XP cache. In a simple write miss, the 
82495XP/82490XP assists the CPU in delivering data to memory, but the 
data is not placed in the cache. No cache lines are affected, so no state 
changes take place. 



4.2.1.5 Write Miss with Allocate 

This is a special case of a write miss where the 
memory location written by the CPU is not currently 
in the 82495XP/82490XP cache, but is brought into 
the cache and updated. Like a regular write miss, 
the 82495XP/82490XP assists the CPU in writing 
the data out to main memory. After the data is writ- 
ten to memory, the 82495XP/82490XP reads back 
the same data following the rules of a read miss, 
above. 

The ability to perform an allocation depends on all of the following 
conditions: 

— the write is cacheable 

— PWT is not asserted (an asserted PWT forces write-through) 

— the write is not LOCKed 

— the write is to memory (not to I/O) 
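
The allocation conditions listed above amount to a single predicate. The sketch below uses a hypothetical function name and cycle labels and shows which memory bus cycles a write miss produces under these conditions.

```python
def write_miss_bus_cycles(cacheable, pwt, locked, to_memory):
    """Return the memory bus cycles generated by a CPU write miss.

    The write itself always goes to memory. A line fill (the allocation)
    follows only if every condition listed above holds.
    """
    cycles = ["write"]
    if cacheable and not pwt and not locked and to_memory:
        cycles.append("line_fill")
    return cycles
```

The line fill then follows the read miss rules, so the allocated line ends up in the state a read miss would have produced.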

4.2.2 TRANSITIONS CAUSED BY OTHER 
DEVICES ON BUS 

MESI state transitions in the 82495XP/82490XP 
cache of one core (CPU/82495XP/82490XP) can 
be induced by actions initiated by other cores or de- 
vices on the shared memory bus. In the following, 
the 82495XP which is responding to actions of other 
devices does not currently own the bus, and may be 
referred to as a "slave" or, in the case of snooping, 
a "snooper". The device which currently owns the 
bus is the "master". 



4.2.2.1 Snooping 

The master which is accessing data from memory 
on the bus sends a request to all caching devices on 
the bus (snoopers) that they check or snoop their 
caches for a more recently updated version of the 
data being accessed. If one of the snoopers has a 
copy of the requested data, it is termed a "snoop 
hit". 

If a snooper has a modified version of the data 
("snoop hit to a Modified line"), it proceeds to gener- 
ate an "inquire cycle" to the i860 XP CPU, asking 
the i860 XP CPU if it also has a Modified copy of the 
line (which would be more recently modified than the 
82495XP/82490XP's version). The most up-to-date 
line is written out by the snooping 
82495XP/82490XP to the bus (to main memory or 
directly to the requesting master) so that the re- 
questing master can utilize it. 

The changes in MESI protocol state in a snooping 
cache which has a snoop hit depend on attribute 
inputs SNPINV and SNPNCA, which are driven by 
the master. 

The SNPINV input tells a snooping 82495XP/82490XP to invalidate the line 
being snooped if hit: the master requesting the snoop is about to write 
to its copy of this line and will therefore have the most up-to-date 
copy. When SNPINV is asserted on the snoop request, any snoop hit is 
placed in Invalid state, and a "back invalidation" is generated which 
instructs the CPU to check its cache and likewise invalidate a copy of 
the line. When the snooping 82495XP has a snoop hit to a Modified line 
and SNPINV was asserted by the bus master, the back invalidate is 
combined with the inquire cycle. 

The SNPNCA input tells a snooping 82495XP/82490XP whether the requesting 
master is performing a Non-Caching Access. If the requesting master is 
not caching the data, a snoop hit to a Modified or Exclusive line can be 
placed in the Exclusive state: since the requester isn't caching the 
line, if the snooper has a future write hit to the line, an invalidation 
does not have to be broadcast. If the requesting master is caching the 
data, then a snoop hit to a Modified or Exclusive line must be placed in 
the Shared state, which insures that a future write hit causes an 
invalidation to other caches. Note that a snoop hit to a Shared line must 
remain in the Shared state regardless of SNPNCA. Also note that an 
asserted SNPINV always overrides SNPNCA. 
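
The snoop hit rules for SNPINV and SNPNCA can be sketched as a single function. The `SnoopResult` type and `snoop_hit` name are hypothetical, states are written as single letters, and the model assumes that only a Modified line needs to be written out to the bus.

```python
from typing import NamedTuple


class SnoopResult(NamedTuple):
    new_state: str          # state of the snooped line after the snoop
    write_back: bool        # Modified data must be written out to the bus
    back_invalidate: bool   # CPU is told to invalidate its copy


def snoop_hit(state, snpinv, snpnca):
    """Resolve a snoop hit in the snooping cache."""
    if state not in ("M", "E", "S"):
        raise ValueError("a snoop hit requires a valid line")
    write_back = state == "M"
    if snpinv:
        # SNPINV always overrides SNPNCA.
        return SnoopResult("I", write_back, True)
    if state == "S":
        # A Shared line stays Shared regardless of SNPNCA.
        return SnoopResult("S", False, False)
    # Modified or Exclusive: stay exclusive only if the master does not cache.
    return SnoopResult("E" if snpnca else "S", write_back, False)
```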

4.2.2.2 Cache Synchronization 

Cache synchronization is performed to bring the main memory up-to-date 
with respect to the 82495XP/82490XP. Two mechanisms exist in the 
82495XP/82490XP to accomplish this: FLUSH and SYNC. 

A cache flush is initiated by asserting the 82495XP FLUSH# pin. Once 
initiated, the 82495XP writes all Modified lines out to main memory, 
performing back invalidations and inquire cycles on the CPU. When 
completed, all 82495XP/82490XP and CPU cache entries will be in the 
Invalid state. 

Activation of the SYNC# pin also causes all of the 82495XP's Modified 
lines to be written to memory. Unlike the FLUSH# pin, the cache lines 
remain valid after the SYNC# process has completed, with Modified lines 
changing to the Exclusive state. 

Figure 4-1. Major State Transitions 



4.3 The Effects of Special Cycles on 
MESI States 

4.3.1 NON-CACHEABLE ACCESS 

The 82495XP allows cacheability to be determined on both a per page and 
per line basis. The page cacheability function is determined by software, 
while cacheability on a line-by-line basis is driven by hardware. 

The PCD (Page Caching Disabled) pin is a 82495XP input driven by the 
CPU's PCD output, which corresponds to a cacheability bit in the page 
table entry of a memory location's virtual address. If the PCD bit is 
asserted when the CPU presents a memory address, that location will not 
be cached in either the 82495XP or the CPU. 

MKEN# is a 82495XP input which connects to the memory bus controller or 
the memory bus. MKEN# inactive prevents the caching of the memory 
location in both the 82495XP and the CPU, affecting only the current 
access. 

If a read miss is indicated non-cacheable by either of these, the line is 
not placed in the 82495XP/82490XP or CPU cache, and no cache states are 
modified. On a write miss, a noncacheable indication from either input 
forces a write miss without allocation. Note that if the 82495XP/82490XP 
already has a valid copy of the line, the PCD attribute from the CPU is 
ignored. 

4.3.2 READ ONLY ACCESSES: MRO# 

The MRO# (Memory Read Only) input is driven by 
the memory bus to indicate that a memory location 
is read only. 

When asserted during a read miss line fill, MRO# 
causes the line to be placed in the 
82495XP/82490XP cache in the Shared state and 
also sets a read-only bit in the cache tag. MRO# 
accesses are not cached in the CPU. On subse- 
quent write hits to a read-only line, the write is actu- 
ally written through to memory without updating the 
82495XP/82490XP line, which remains in the 
Shared state with the read-only bit set. 

4.3.3 LOCKED ACCESSES: LOCK# 

The LOCK# signal driven by the CPU indicates that 
the requested cycle should lock the memory loca- 
tion for an atomic memory access. Because locked 
cycles are used for interprocessor and intertask syn- 
chronization, all locked cycles will appear on the 
memory bus. 

On a locked write, the 82495XP treats the access as 
a write-through cycle, sending the data to the memo- 
ry bus — updating memory and invalidating other 
cached copies. If the data is also present in the 
82495XP/82490XP cache, it is updated but its M, E, 
or S state remains unchanged. 

For locked reads, the 82495XP assumes a cache 
miss and starts a memory read cycle. If the data 
resides in the 82495XP/82490XP, the M-E-S state 
of the data remains unchanged. If the requested 
data is in the 82495XP/82490XP and is in the 
Modified state when the memory bus returns data, 
the 82495XP will use the 82490XP data and ignore 
the memory bus data. 

LOCKed read and write cycles which miss the 
82495XP/82490XP cache are noncacheable in both 
the 82495XP/82490XP and CPU. 



4.3.4 FORCING LINES DIRECT-TO-MODIFIED: DRCTM#



The DRCTM# (Direct To Modified) pin is an input 
which informs the 82495XP to skip the Exclusive 
state and place a line directly in the Modified state. 
The signal can be asserted during 
82495XP/82490XP reads of the memory for special 
82495XP/82490XP data accesses like read-for- 
ownership and cache-to-cache-transfer. The signal 
can also be asserted during writes, for purposes of 
cache tracking. 



4.4 State Tables

Lines cached by the 82495XP can change states as
a result of either CPU bus activity (which sometimes
requires the 82495XP to become a memory bus
master) or memory bus activity generated by other
system masters (snooping).

State transitions are affected by the type of CPU/ 
memory bus transactions (reads, writes) and by a 
set of external input signals and internally generated 
variables. In addition, the 82495XP will drive certain 
CPU/memory bus signals as a result of the consist- 
ency protocol. 



4.4.1 CPU BUS 

— PWT (Page Write Through, PWT input pin):
Indicates a CPU bus write-through request. Activated
by the i860 XP CPU PWT pin. This signal affects
line fills and will cause a line to be put in the
[S] state if active. The 82495XP will NOT execute
ALLOCATIONS (line fills triggered by a
write) for write-through lines. If PWT is asserted,
it overrides a write-back indication on the
MWB/WT# pin.

— PCD (Page Cacheability Disable, PCD input pin):
Indicates that the accessed line is noncacheable.
If PCD is asserted, it overrides a cacheable
indication from an asserted MKEN#.

— NWT (i860 XP CPU Write-Through Indication,
82495XP's WB/WT# output pin): When low,
forces the i860 XP CPU to keep the accessed
line in the SHARED state.
Write-back mode (WB = 1) will be indicated by
the !NWT notation. In those cases the i860 XP
CPU is allowed to go into exclusive states [E],
[M]. NWT is normally active unless explicitly
stated.

— KEN (CPU caching enable, KEN# output pin):
When active, indicates that the requested line
can be cached by the CPU 1st level cache. KEN
is normally active unless explicitly stated.

4.4.2 MEMORY BUS 

— MWT (Memory Bus Write-Through Indication,
MWB/WT# input pin): When active, forces the
82495XP to keep the accessed line in the
SHARED state. Write-back mode (MWB = 1) will
be indicated by the !MWT notation. In those cases
the 82495XP is allowed to go into exclusive
states [E], [M].

— DRCTM (Memory Bus Direct To [M] Indication,
DRCTM# input pin): When active, forces skipping
of the [E] state and direct transfer to [M].

— MKEN (Memory Bus Cacheability Enable,
MKEN# input pin): When active, indicates that
the memory bus cycle is cacheable.

— MRO (Memory Bus Read-Only Indication,
MRO# input pin): When active, forces the line to be
READ-ONLY.

— MTHIT (Tag Hit, MTHIT# output pin): Activated
by the 82495XP during snoop cycles to indicate
that the current snooped address hits the
82495XP cache.

— MHITM (Hit to a line in the [M] State, MHITM#
output pin): Activated by the 82495XP during
snoop cycles to indicate that the current
snooped address hits a modified line in the
82495XP cache.

— SNPNCA (Non-Caching device access): When
active, indicates to the 82495XP that the current
bus master is a non-caching device.

— SNPINV (Invalidation): When active, indicates to
the 82495XP that the current snoop cycle will
invalidate that address.

4.4.3 TAG STATE 

— TRO (Tag Read Only, 82495XP Tag bit): This bit 
when set indicates that the 1 or 2 lines associat- 
ed with this tag are Read-Only lines. 



As a function of State Changes the 82495XP
may execute the following cycles:

— BINV: Execution of a CPU Back Invalidation
Cycle (Snoop with INV active).

— INQR: Execution of an i860 XP CPU Inquire
Cycle (1).

— WBCK: 82495XP Write-Back Cycle. This is a 
Memory Bus write cycle generated by the 
82495XP when MODIFIED data cached in the 
82495XP needs to be copied back into main 
memory. A write-back cycle affects a complete 
82495XP line. 

— WTHR: 82495XP Write Through Cycle. This is a 
system write cycle in response to a processor 
write. It may or may not affect the cache SRAM 
(update). In a write-through cycle, the 82495XP 
drives the Memory Bus with the same Address, 
Data and Control signals as the CPU does on the 
CPU Bus. Main Memory is updated, and other 
Caches invalidate their copies. 

— RTHR: 82495XP Read Through cycle. This is a 
special cycle to support locked reads to lines 
that hit the 82495XP cache. The 82495XP will 
request a Memory Bus cycle for lock synchroni- 
zation reasons, data will be supplied from the 
BUS except for [M] state which will have data 
supplied from the CACHE. 

— LFIL: 82495XP Cache line fill. 82495XP will gen- 
erate Memory Bus cycles to fetch a new line and 
deposit into the cache. 

— RNRM: 82495XP Read Normal Cycle. This is a
normal read cycle which will be executed by the
82495XP for non-cacheable accesses.

— SRUP: 82495XP SRAM UPDATE. Occurs any 
time new information is placed in the 82495XP 
cache. An SRAM update is implied in the LFIL 
cycle. 

— ALLOC: 82495XP ALLOCATION. A Write Miss
cycle that has been determined to be cacheable, so
the 82495XP issues a line read.

NOTE: 

1. An inquire cycle may be executed with INV ac- 
tive, performing a back-invalidation simultaneously. 
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STATE TABLES 



Table 4-1. Master 82495XP Read Cycle

Pres.  Condition: Next State                      Mem Bus   CPU Bus   Comments
State                                             Activity  Activity

M      !LOCK: M                                   -         !NWT      Normal Read Hit [M]
       LOCK: M                                    RTHR      !KEN      Read Through Cycle, Data From Array

E      !LOCK: E                                   -         NWT       Normal Read Hit [E]
       LOCK: E                                    RTHR      !KEN      Read Through Cycle, Data From Memory

S      !LOCK.!TRO: S                              -         NWT       Normal Read Hit [S]
       !LOCK.TRO: S                               -         !KEN      Normal Read to Read-Only sector. Stays in
                                                                      [S] state and deactivates KEN to prevent
                                                                      CPU from caching line
       LOCK: S                                    RTHR      !KEN      Read Through Cycle, Data From Memory

I      PCD+!MKEN+LOCK: I                          RNRM      !KEN      Non-Cacheable Read, Locked cycles
       !PCD.MKEN.!LOCK.MRO: S                     LFIL      !KEN      Cacheable read, Read-Only. Fill line to
                                                                      82495XP. Do not allow CPU to cache line, by
                                                                      deactivating KEN#. Set the 82495XP's TRO
                                                                      bit to indicate the sector read-only attribute
       !PCD.MKEN.!LOCK.!MRO.(PWT+MWT): S          LFIL      NWT       Cacheable Reads, forced Write-Through
       !PCD.MKEN.!LOCK.!MRO.!PWT.!MWT.!DRCTM: E   LFIL      NWT       Line not shared, thus enabling the 82495XP
                                                                      to move into an exclusive state
       !PCD.MKEN.!LOCK.!MRO.!PWT.!MWT.DRCTM: M    LFIL      NWT       As before with direct [M] state transfer.
                                                                      Keep i860 XP CPU in Write Through mode
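
The read-miss rows of Table 4-1 (present state [I]) form a priority chain that can be written out directly. The sketch below is illustrative only: the function name `read_miss_result` and the convention that each argument is True when the corresponding condition holds (for example `mken=True` means the cycle is cacheable) are assumptions, not part of the chip set's documentation.

```python
def read_miss_result(pcd: bool, mken: bool, lock: bool, mro: bool,
                     pwt: bool, mwt: bool, drctm: bool):
    """Return (next_state, memory_bus_cycle) for a read that misses the cache.

    Arguments are logical values as used in the table conditions
    (True = the named term, False = its "!" form).
    """
    if pcd or not mken or lock:
        return ("I", "RNRM")      # non-cacheable or locked: normal read, no fill
    if mro:
        return ("S", "LFIL")      # read-only line: filled, held Shared, TRO set
    if pwt or mwt:
        return ("S", "LFIL")      # forced write-through keeps the line Shared
    # cacheable write-back line: Exclusive unless DRCTM skips straight to [M]
    return ("M" if drctm else "E", "LFIL")
```

A cacheable, unlocked read with no write-through request, `read_miss_result(False, True, False, False, False, False, False)`, returns `("E", "LFIL")`.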
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Table 4-2. Master 82495XP Write Cycle 



Pres.  Condition: Next State              Mem Bus      CPU Bus      Comments
State                                     Activity     Activity

M      !LOCK: M                           -            SRUP, !NWT   Write hit. Write to cache. Allow i860 XP
                                                                    CPU to perform internal write cycles
                                                                    (enter into [E], [M] states).
       LOCK: M                            WTHR         SRUP, !NWT   Locked Cycle. Write-Through updating
                                                                    cache SRAM. Most updated copy of the
                                                                    line is still owned by 82495XP. All
                                                                    Locked write cycles are posted.

E      !LOCK: M                           -            SRUP, !NWT   Write hit. Update SRAM. Let i860 XP
                                                                    CPU execute internal write cycles.
       LOCK: E                            WTHR         SRUP, NWT    Lock forces cycle to memory bus. Main
                                                                    memory remains updated.

S      TRO: S                             WTHR         -            Read-Only. Write cycle with write
                                                                    through attribute from CPU or Memory
                                                                    Bus. Locked Cycles.
       !TRO.(PWT+MWT+LOCK): S             WTHR         SRUP, NWT    Not Read-Only. Write cycle with write
                                                                    through attribute from CPU or Memory
                                                                    Bus. Locked Cycles.
       !TRO.!PWT.!LOCK.!MWT.!DRCTM: E     WTHR         SRUP, NWT    Not Read-Only. No write-through cycle,
                                                                    no lock request; allow going into
                                                                    exclusive state.
       !TRO.!PWT.!LOCK.!MWT.DRCTM: M      WTHR         SRUP, NWT    Not Read-Only. No write-through cycle,
                                                                    no lock request; allow going into
                                                                    exclusive state. DRCTM forces final
                                                                    state to M.

I      PCD+!MKEN+PWT+LOCK+MRO: I          WTHR         -            Write Miss Non-Cacheable, Write-
                                                                    Through, locked cycle or Read-Only.
       !PCD.MKEN.!PWT.!LOCK.!MRO: I       WTHR, LFIL                Write Miss with allocation. After the write
       !PCD.MKEN.!PWT.!LOCK.MRO: S        ALLOC                     cycle, a line fill (allocation) is scheduled.
       Allocation Final State:                                      If MKEN and MRO are asserted, an
         MWT: S                                                     allocation to the [S] state will occur.
         !MWT.!DRCTM: E                                             Allocation final state as a function of
         !MWT.DRCTM: M                                              line fill attributes.



NOTE: 

The WB/WT# pin will only be activated for 82495XP lines that are in the [M] state. In this state, the 82495XP always 
assumes that the line owner MAY be the i860 XP CPU. On all other states the i860 XP CPU will be forced to perform Write- 
Through cycles. This mechanism will make sure that any i860 XP CPU write cycle is seen at least once on the CPU Bus. 
Allocations, which are consequences of write-misses, will disregard the MKEN# and MRO# attributes during the line fill. In 
other words, once an allocation is scheduled, it cannot be cancelled. 
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Table 4-3. Snooping 82495XP without Invalidation Request 



Pres.  Condition:    Mem Bus     CPU Bus    Comments
State  Next State    Activity    Activity

M      !SNPNCA: S    MTHIT,      INQR       Snoop hit to modified line. 82495XP indicates tag hit and
       SNPNCA: E     MHITM,                 modified hit. 82495XP schedules flushing of the modified
                     WBCK                   line to memory. If non-caching device, go to [E] state.

E      !SNPNCA: S    MTHIT       -          If snooping by caching device, indicate MTHIT and go
       SNPNCA: E                            to shared state. If non-caching device, only indicate
                                            MTHIT and stay exclusive.

S      S             MTHIT       -

I      I             -           -





NOTE: 

Usage of DRCTM# to avoid [E] states may be in conflict with the SNPNCA cycle attribute. Note in the table that snoops
with SNPNCA may cause an [E] state transition.

Table 4-4. Snooping 82495XP with Invalidation Request

Pres.  Next     Mem Bus     CPU Bus    Comments
State  State    Activity    Activity

M      I        MTHIT,      INQR,      Snoop hit to modified line. 82495XP indicates tag hit and
                MHITM,      BINV       modified hit. 82495XP schedules flushing of the modified
                WBCK                   line to memory. Invalidate CPU.

E      I        MTHIT       BINV       Indicate tag hit, invalidate 82495XP and CPU lines.

S      I        MTHIT       BINV       Same as before

I      I        -           -




Table 4-5. SYNC Cycles

Pres.  Next     Mem Bus     CPU Bus    Comments
State  State    Activity    Activity

M      E        WBCK        INQR       Get modified data from i860 XP CPU, flush to memory

E      E        -           -          Memory already synchronized

S      S        -           -          Memory already synchronized

I      I        -           -
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Table 4-6. FLUSH Cycles 



Pres.  Next     Mem Bus     CPU Bus    Comments
State  State    Activity    Activity

M      I        WBCK        INQR,      Flush and invalidate i860 XP CPU
                            BINV

E      I        -           BINV       Invalidate i860 XP CPU

S      I        -           BINV       Invalidate i860 XP CPU

I      I        -           -





NOTE: 

Usage of DRCTM# to avoid [E] states may be in conflict with the SYNC cycle. Note in the table that SYNC cycles move an 
[M] state line to [E]. 
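
The difference between SYNC and FLUSH in Tables 4-5 and 4-6 comes down to what happens to valid lines after the write-back. A minimal Python sketch follows; the function names are hypothetical, and the FLUSH next state of [I] is taken from the statement in Section 4.2.2.2 that all entries end in the Invalid state.

```python
VALID_STATES = ("M", "E", "S", "I")

def sync_next_state(state: str) -> str:
    """SYNC: Modified lines are written back and become Exclusive; others keep their state."""
    if state not in VALID_STATES:
        raise ValueError(f"unknown MESI state: {state!r}")
    return "E" if state == "M" else state

def flush_next_state(state: str) -> str:
    """FLUSH: every line ends in the Invalid state."""
    if state not in VALID_STATES:
        raise ValueError(f"unknown MESI state: {state!r}")
    return "I"

def needs_write_back(state: str) -> bool:
    """Only Modified lines generate a WBCK cycle, for SYNC and FLUSH alike."""
    return state == "M"
```

The sketch also makes the conflict named in the note visible: `sync_next_state("M")` returns "E", a state that a system relying on DRCTM# would otherwise never see.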



5.0 CONFIGURATIONS 

The 82495XP/82490XP cache system was de- 
signed to fit a variety of applications. For the great- 
est performance, each application requires the 
82495XP/82490XP to be configured differently. The 
82495XP/82490XP therefore has many possible 
configurations that are set on RESET and affect the 
82495XP/82490XP architecture, operation, and 
electrical characteristics. 



5.1 Physical Cache 

The physical configurations of the 82495XP/
82490XP consist of parameters that alter the
82495XP/82490XP basic architecture. These are
line ratio, tag size, lines per sector, bus width, and
cache size. These parameters are sampled at the
falling edge of RESET and are not dynamically
changeable.

Because of physical cache constraints, choosing
one parameter limits the flexibility of other
parameters. The following table summarizes the possible
i860 XP CPU basic cache configurations. CFG0-
CFG2 are multiplexed to select one of 5 possible
line ratio/tag size/lines per sector configurations.
This information is automatically passed from the
82495XP to 82490XP during RESET. CFG0-CFG3
must be valid at least 10 clocks before RESET's
falling edge.



Figure 5-1. 82495XP/82490XP Configurations
(Grid of memory bus width, 64 or 128 bits, and transfers
per line, 4 or 8, against the number of 82490XP devices,
8 or 16. Each cell names one of configurations 1 through 5
with its LR, Tags, and L/S values, or is marked Not
Supported. LR = 82495XP/CPU Line Ratio; L/S = 82495XP
Lines/Sector. Cache devices are 2, 4, or 8 bits wide.)
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Table 5-1. CFG Configuration Inputs 


Config  Line   Lines/  No. of  CFG2  CFG1  CFG0
No.     Ratio  sec     Tags

1       1      1       8K      0     0     1
2       2      1       4K      1     1     1
3       1      2       8K      0     0     0
4       2      1       8K      0     1     1
5       4      1       4K      1     1     0






5.1.1 LINE RATIO (LR) 

Line Ratio (LR) is the ratio of the 82495XP/82490XP 
cache line size to the CPU cache line size. For ex- 
ample, if LR = 2 then the 82495XP/82490XP line 
size is 64 bytes. This information is also used to de- 
termine the number of back invalidations or inquire 
cycles to the i860 XP CPU. 

5.1.2 TAG SIZE (TAGS) 

The 82495XP/82490XP cache tag size may be 4K 
or 8K tag entries. By reducing tag size, the line ratio 
(LR) can be doubled without a change in cache size. 

5.1.3 LINES PER SECTOR (L/S) 

The 82495XP/82490XP may be non-sectored (L/S 
= 1) or contain two lines per sector (L/S = 2). If 
L/S = 2, then the 82495XP contains one tag for two 
consecutive cache lines and each cache line has its 
own set of MESI state bits. This allows just one line 
to be filled on replacements or written back on 
snoop hits. Both lines are written back during re- 
placements, if both are modified. 



5.1.4 BUS SIZE 

The 82495XP/82490XP supports 64 and 128 bit 
memory bus widths for the i860 XP CPU. 



5.1.5 CACHE SIZE 

The 82495XP/82490XP may be configured to be
256K or 512K bytes. Cache size is a direct result of
the number of 82490XP devices used. It takes 8
82490XPs to make a 256K byte cache and 16
82490XPs for a 512K byte cache.
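
The relationship between line ratio, lines per sector, tag count, and cache size can be checked with a few lines of arithmetic. The sketch below is illustrative only. It assumes a 32-byte CPU line (implied by "LR = 2 gives a 64-byte line"), assumes that cache size equals tags x lines per sector x line size, and assumes 32K bytes per 82490XP (from 8 devices for 256K); the names `CONFIGS`, `line_bytes`, `cache_bytes`, and `devices_needed` are hypothetical.

```python
CPU_LINE_BYTES = 32            # assumption: LR = 2 corresponds to a 64-byte line
BYTES_PER_82490XP = 32 * 1024  # assumption: 8 devices make a 256K byte cache

# config number -> (line ratio, lines per sector, number of tags), per Table 5-1
CONFIGS = {
    1: (1, 1, 8 * 1024),
    2: (2, 1, 4 * 1024),
    3: (1, 2, 8 * 1024),
    4: (2, 1, 8 * 1024),
    5: (4, 1, 4 * 1024),
}

def line_bytes(config: int) -> int:
    """82495XP/82490XP line size in bytes for a configuration number."""
    line_ratio, _, _ = CONFIGS[config]
    return line_ratio * CPU_LINE_BYTES

def cache_bytes(config: int) -> int:
    """Cache size implied by tags x lines per sector x line size."""
    _, lines_per_sector, tags = CONFIGS[config]
    return tags * lines_per_sector * line_bytes(config)

def devices_needed(config: int) -> int:
    """Number of 82490XP data devices implied by the cache size."""
    return cache_bytes(config) // BYTES_PER_82490XP
```

Under these assumptions configurations 1 and 2 come out at 256K bytes (8 devices) and configurations 3 through 5 at 512K bytes (16 devices), which is consistent with the two cache sizes stated above.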



5.1.6 FUNCTION AND ADDRESS 

CONNECTIONS (CFA0-CFA6) 

The following table lists which address lines should
be connected to each of the CFA0-CFA6 lines for
each cache configuration. CFA0-CFA6 provide the
82495XP with proper multiplexed addresses for
each of the possible cache configurations.
Depending on the mode selected, either CFA5 or CFA4 will
operate as the 82495XP's CTYP input. This input is
connected to the i860 XP CPU's CTYP output.
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Table 5-2. CFA Address Connections 


Config  Line   Lines/  No. of  CFA6  CFA5  CFA4  CFA3  CFA2  CFA1  CFA0
No.     Ratio  sec     Tags

1       1      1       8K      A5    CTYP  A31   A30   A29   A4    A3
2       2      1       4K      A5    CTYP  A31   A30   A29   A4    A3
3       1      2       8K      A6    A5    CTYP  A31   A30   A4    A3
4       2      1       8K      A6    A5    CTYP  A31   A30   A4    A3
5       4      1       4K      A6    A5    CTYP  A31   A30   A4    A3



5.2 Cache Modes 

Cache modes are ways of configuring the 
82495XP/82490XP to operate differently. These op- 
tions are all sampled at RESET and are not dynami- 
cally changeable. If some of these configuration op- 
tions share a pin, such as the 82495XP's SYNC# 
and MEMLDRV, the configuration option must meet 
a specific setup and hold time to RESET'S falling 
edge. For the 82495XP, setup time is usually 4 
clocks, and for the 82490XP, setup time is usually 1 
clock. For both parts, the configuration option must 
be held until RESET is detected low. 



(Timing diagram: CLK, RESET, and a configuration input,
with the Setup and Hold windows marked around RESET's
falling edge.)



5.2.1 MEMORY BUS MODES 

The 82495XP/82490XP may be configured to have
a clocked or strobed memory bus. Memory bus
mode is selected by the 82490XP MSTBM pin (same
as MCLK pin). If MSTBM is strapped high, the
82490XPs operate in strobed mode. If MSTBM is
toggling, i.e., it is connected to the memory bus clock,
the 82490XPs operate in clocked mode. MCLK need
not be synchronous to CLK.



5.2.2 SNOOPING MODES 

The 82495XP/82490XP supports three snooping 
modes: synchronous, clocked, and strobed. Snoop- 
ing mode is selected by the SNPMD (same as 
SNPCLK) pin. If SNPMD is low the 82495XP snoops 
synchronously. If SNPMD is high the 82495XP 
snoops in strobed mode. If SNPMD is toggling, 
clocked mode is selected and SNPMD becomes a 
snoop clock source, SNPCLK, which clocks in the 
snoop requests. 



Figure 5-2. Configuration Input Sampling 
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These three snooping modes only alter the way the 
memory bus controller may initiate a snoop request 
to the 82495XP. The 82495XP response is always 
synchronous to the CPU CLK. 

5.2.3 BUS DRIVERS 

The 82495XP/82490XP provide 2 types of memory 
bus drivers: High capacitance drivers and low capac- 
itance drivers. The high capacitance drivers are se- 
lected by driving both the 82495XP and 82490XP 
MEMLDRV pins low at RESET. Similarly, the low ca- 
pacitance drivers are selected with MEMLDRV high. 

With C490LDRV the 82495XP also provides two 
types of drivers when driving the 82490XP's. Refer 
to the interface document to determine C490LDRV. 



5.2.4 STRONG/WEAK WRITE ORDERING 

If the 82495XP pin WWOR# is sampled low at
RESET, the 82495XP enforces weak write-ordering.
If sampled high, the 82495XP enforces strong
write-ordering. Strong write-ordering prevents the
82495XP from completing a write cycle that would
go to 'M' state if a posted write is pending (has not
been granted the bus with BGT#). By doing this,
strong ordering ensures that write cycles from the
CPU are written to memory in the same order that
they appear in the i860 XP CPU program.

5.2.5 i860 XP CPU PFLD SUPPORT

The i860 XP microprocessor executes PFLD (Pipe- 
lined Floating-Point Load) instructions to implement 
special data handling, typically for vector operations. 
This instruction allows loading of data through a 
FIFO pipeline, to hide memory latency. The i860 XP 
CPU does not cache data returned by a PFLD cycle. 

The 82495XP can be configured to decode the
i860 XP microprocessor's PFLD cycles. The
82495XP supports 3 operational modes for PFLD
cycle decoding:

Mode #1. PFLD cycles are cached in the 82495XP. 

This mode is used in applications that 
can fit entirely in the 82495XP/82490XP 
cache. The 82495XP treats PFLD cycles 
as normal read cycles. 



Mode #2. PFLD cycles are not cached in the 
82495XP, without an external PFLD ex- 
tension FIFO. 

This mode is used when applications are 
too large to fit in the 82495XP/82490XP 
cache. The 82495XP treats PFLD cycles 
as noncacheable, using the same proto- 
col as cycles with PCD=1 (if data is al- 
ready cached, it will be supplied from the 
cache). 

Mode #3. PFLD cycles are not cached in the
82495XP, with an external PFLD extension FIFO.

This mode allows the PFLD FIFO to be
extended beyond the three stages built
into the i860 XP CPU by adding external
FIFO hardware. The 82495XP treats
PFLD cycles in the same manner as its
treatment of LOCKed cycles (all cycles
go to the bus, even if data is already
present in the cache). To support the external
FIFO, the 82495XP identifies PFLD
cycles by asserting its FPFLD output. For
proper operation, data which can be
accessed by PFLD must never be in the
cache in the Modified state, and software
must be aware of the length of the
combined PFLD pipeline. Because this mode
is not software transparent, it must be
used with extreme care.

The choice of PFLD mode is largely application
dependent. The PFLD mode of the 82495XP is
selected by configuration pins FPFLDEN and NCPFLD#,
which are sampled at RESET. FPFLDEN shares a
pin with FPFLD, and NCPFLD# shares a pin with
FLUSH#. Depending on the PFLD mode, data for
reads will either be supplied to the CPU from the
82495XP, or from the memory bus. Table 5-3
summarizes the 82495XP's support for i860 XP CPU
PFLD cycles.
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Table 5-3. 82495XP PFLD Modes 


Mode#  FPFLDEN  NCPFLD#  Data Supplied From                  Line Fill
                         [I]      [S]      [E]      [M]      on [I]

1      0        1        MEMBUS   CACHE    CACHE    CACHE    Yes
2      0        0        MEMBUS   CACHE    CACHE    CACHE    No
3      1        1        MEMBUS   MEMBUS   MEMBUS   MEMBUS   No
X      1        0        Illegal Mode
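
The PFLD mode selection and the resulting data source can be sketched as a pair of lookups. This is an illustration only: the function names are hypothetical, the pin values are the levels sampled at RESET, and the 0 entries are an assumption read from cells that are blank in this copy of Table 5-3.

```python
def pfld_mode(fpflden: int, ncpfld_n: int) -> int:
    """Return the PFLD mode number (1, 2 or 3) for the sampled pin levels.

    fpflden  -- level of FPFLDEN at RESET (0 or 1)
    ncpfld_n -- level of NCPFLD# at RESET (0 or 1; the pin is active low)
    """
    modes = {(0, 1): 1, (0, 0): 2, (1, 1): 3}
    try:
        return modes[(fpflden, ncpfld_n)]
    except KeyError:
        raise ValueError("FPFLDEN=1 with NCPFLD#=0 is an illegal mode") from None

def pfld_data_source(mode: int, state: str) -> str:
    """Return "CACHE" or "MEMBUS" for a PFLD read to a line in the given MESI state."""
    if mode not in (1, 2, 3):
        raise ValueError(f"unknown PFLD mode: {mode}")
    if mode == 3 or state == "I":
        return "MEMBUS"      # mode 3 always goes to the bus; misses always do
    return "CACHE"           # modes 1 and 2 serve hits from the cache

def pfld_fills_on_miss(mode: int) -> bool:
    """Only mode 1 treats PFLD as a normal, cacheable read."""
    return mode == 1
```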



5.3 82490XP Bus Configuration 

The 82490XP needs to be configured so it knows to 
drive 4 or 8 MDATA lines and whether it should do 4 
or 8 memory transfers per line fill. This is done 
through the MX4/MX8# and the MTR4/MTR8# 
configuration inputs. For a given line ratio (memory 
bus line size / CPU line size), they should be sam- 
pled as follows: 

Table 5-4. MX/MTR Configurations

Line   MX4/   MTR4/   Membus  CPUbus
Ratio  MX8#   MTR8#   I/O     I/O

1      1      1       4       4
2      1      0       4       4
2      0      1       8       4
4      0      0       8       4
1      0      1       8       8
2      0      0       8       8
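
The two configuration inputs can be decoded into a memory bus width and a transfer count, from which the bytes moved per line fill follow. The sketch below is a hedged illustration: the polarity (high selects 4, low selects 8) is inferred from the pin names, and the `data_devices` parameter (the number of non-parity 82490XPs on the bus) is an assumption introduced for the example.

```python
def decode_bus_config(mx4_mx8_n: int, mtr4_mtr8_n: int):
    """Return (mdata_lines_per_device, transfers_per_line_fill).

    Polarity assumed from the pin names: high selects 4, low selects 8.
    """
    mdata_lines = 4 if mx4_mx8_n else 8
    transfers = 4 if mtr4_mtr8_n else 8
    return mdata_lines, transfers

def line_fill_bytes(data_devices: int, mx4_mx8_n: int, mtr4_mtr8_n: int) -> int:
    """Bytes moved by one line fill, given the number of data 82490XPs."""
    mdata_lines, transfers = decode_bus_config(mx4_mx8_n, mtr4_mtr8_n)
    bus_bits = data_devices * mdata_lines   # total memory bus width in bits
    return bus_bits // 8 * transfers
```

With 16 devices, `line_fill_bytes(16, 1, 1)` gives 32 bytes (line ratio 1 under a 32-byte CPU line) and `line_fill_bytes(16, 0, 0)` gives 128 bytes (line ratio 4).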



5.3.1 82490XP PARITY CONFIGURATION 

A 82490XP may be designated as a parity device. 
This is done by strapping the PAR# pin low. In this 
configuration CDATA[0:3] are used to store 4 parity 
bits, and CDATA[4:7] are used as 4 bit enables. The 
four bit enables allow the writing of individual parity 
bits. 

Every mode and configuration of a non-parity 
82490XP may be used and selected on the parity 
82490XP device. The 82490XP parity configurations 
are as follows: 

Table 5-5. Parity Configurations

Cache   Memory Bus   Number of        82490XP I/O bits
Size    Width        Parity Devices   (CPU/Mem)

256K    64           2                4/4
512K    128          2                4/8



5.3.2 CPU 82490XP ADDRESS 
CONFIGURATIONS 

The 82490XP Address inputs (A) are multiplexed to 
the CPU address lines (CA) according to the cache 
size: 



Table 5-6. 82490XP Address Connections

82490XP       Cache Size
Address Pin   256K    512K

A15           CA16    CA17
A14           CA15    CA16
A13           CA14    CA15
A12           CA13    CA14
A11           CA12    CA13
A10           CA11    CA12
A9            CA10    CA11
A8            CA9     CA10
A7            CA8     CA9
A6            CA7     CA8
A5            CA6     CA7
A4            CA5     CA6
A3            CA4     CA5
A2            CA3     CA4
A1            CA      CA3
A0            Vss     CA2

(The 256K entry for A1 reads "CA" with no legible index.)

NC = No Connect.

6.0 CACHE OPERATION

(Block diagram. Legible labels: Cycle Progress, Snoop,
Control, Data, Cycle Control, Address XVRs/LATCHES,
Data, MEMORY BUS.)

Figure 6-1. Memory Bus Controller Interface Model



Figure 6-1 shows the memory bus controller (MBC) 
interface model. The memory bus controller interfac- 
es to the i860 XP CPU, 82495XP, 82490XP, and 
memory bus. The MBC interface was defined with a 
minimal set of assumptions as to the memory bus 
implementation. The chipset was designed to enable 
flexibility in the design of a memory bus and control- 
ler. 



The 82495XP requests control of the memory bus
by signalling the memory bus controller. The
memory bus controller is responsible for arbitrating and
granting the bus to the 82495XP. Once granted, the
memory bus controller is responsible for executing
the requested cycle, snooping the other caches, and
ending the cycle. The 82495XP supports different
modes of snooping, different modes of memory bus
operation, and various special cycles. Memory Bus
Controller design dictates which of these features
are used, and exactly how they are used.
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6.1 Cycle Attribute and Progress 
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Figure 6-2. Cycle Attribute and Progress Signals 

CADS# indicates the start of the cycle address
phase. CDTS# tracks CADS# and indicates the
start of the cycle data phase. For Read cycles it
indicates that starting in the next CLK the CPU data
bus is in read mode under the control of the MBC
until the last BRDY#. In Read cycles, if the MBC
already owns the CPU data bus, CDTS# will be
activated with CADS#. For ALLOCATE cycles the MBC
does not need the CPU data bus, therefore CDTS#
is activated together with CADS#.

For Write cycles CDTS# indicates that the 1st piece 
of data is available on the memory bus. For write- 
back cycles CDTS# indicates that all data is avail- 
able (write-back buffer or snoop buffer loaded with 
correct write-back data). 

As a response to the cycle request, the memory bus 
controller responds with cycle progress signals. All 
cycle progress signals are sampled ONCE in specif- 
ic windows and then ignored until CRDY# of the 
corresponding cycle. BGT# indicates a commitment 
by the memory bus controller to complete the cycle 
execution on the memory bus. Up until this point the 
82495XP owns the cycle. This means that interven- 
ing snoop-write-backs will abort it and the 82495XP 
re-issues the cycle to the MBC. There is only one 
case where the 82495XP will issue a new, not a re- 
issued, cycle; if the original CADS# operation is a 
write-back cycle, and the interrupting snoop cycle 
hits that write-back buffer, then the subsequent 
CADS# will be for a completely new cycle (not a re- 
issuing of the interrupted CADS# operation). 



After BGT# the memory bus controller owns the cy- 
cle. The 82495XP assumes the cycle will terminate 
and will not re-issue it on snoop-write-backs. Follow- 
ing BGT# comes KWEND# which indicates that the 
cacheability window is closed and that the 82495XP 
can sample MKEN#, MRO# attributes. Those indi- 
cate to the 82495XP cacheability and read-only re- 
spectively. These attributes can be determined by 
decoding the 82495XP address. Based on those at- 
tributes the 82495XP executes ALLOCATIONS, 
Line-fills, Replacements, etc. 

Following KWEND#, SWEND# is activated. It
indicates that the Snoop Window is closed. The
82495XP samples the MWB/WT# and DRCTM#
attributes. These attributes are determined by snooping
the other caches in the system. At this point the
82495XP updates its TAG RAM state related to the
line access in progress.

Lastly the MBC issues CRDY#, which indicates to
the 82495XP the end of the transaction data phase.

The 82495XP allows memory bus pipelining by
providing CNA#, which allows the MBC to request a
new address phase before the conclusion of the
current data phase. The 82495XP supports a
1-level-deep address pipeline on the Memory Bus.
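
The sequence of cycle progress signals and the attributes sampled at each step can be modelled as a small tracker. This is a simplified sketch, not a description of the hardware: it enforces a strict BGT#, KWEND#, SWEND#, CRDY# order as the text lists them and ignores clock-level timing and pipelining through CNA#. The class name `CycleProgress` and its members are hypothetical.

```python
# Order in which the MBC returns cycle progress signals, as described above.
PROGRESS_ORDER = ("BGT#", "KWEND#", "SWEND#", "CRDY#")

# Attributes the 82495XP samples when each window closes.
SAMPLED_AT = {
    "KWEND#": ("MKEN#", "MRO#"),
    "SWEND#": ("MWB/WT#", "DRCTM#"),
}

class CycleProgress:
    """Tracks one memory bus cycle from CADS# to CRDY#."""

    def __init__(self):
        self.seen = []

    def signal(self, name: str):
        """Record a progress signal; return the attributes sampled with it."""
        if self.done:
            raise ValueError("cycle already ended with CRDY#")
        expected = PROGRESS_ORDER[len(self.seen)]
        if name != expected:
            raise ValueError(f"expected {expected}, got {name}")
        self.seen.append(name)
        return SAMPLED_AT.get(name, ())

    @property
    def owner(self) -> str:
        """Before BGT# the 82495XP owns the cycle; afterwards the MBC does."""
        return "MBC" if "BGT#" in self.seen else "82495XP"

    @property
    def done(self) -> bool:
        return len(self.seen) == len(PROGRESS_ORDER)
```

Ownership matters for snoops: while `owner` is "82495XP", an intervening snoop write-back aborts the cycle and the 82495XP re-issues it; once `owner` is "MBC", the cycle runs to completion.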



6.2 Snoop Operations 

The 82495XP provides the capability of snooping
operations on the memory bus to ensure cache
consistency. A snoop operation consists of two phases:
1) initiation phase and 2) response phase.

(Diagram: snoop Initiation phase followed by Response
phase.)

Figure 6-3. 82495XP Snooping Operations

During the initiation phase the MBC provides the 
82495XP with the snoop address information. During 
the response phase the 82495XP provides the 
snoop status information. 
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6.2.1 SNOOP INITIATION PHASE 

The 82495XP provides three modes for initiating 
snoops: 

1. Strobed: the falling edge of SNPSTB# is used. 

2. Clocked: SNPSTB# is sampled with SNPCLK. 

3. Synchronous: SNPSTB# is sampled with CLK. 

These three snooping modes are configured as fol- 
lows: 

1. Strobed: The SNPCLK [SNPMD] signal must be 
strapped high. 

2. Clocked: The SNPCLK[SNPMD] signal must be 
connected to the snoop clock source. 

3. Synchronous: The SNPCLK [SNPMD] signal 
must be strapped low. 



NOTE: 

The 82495XP samples the SNPCLK[SNPMD] signal at
the falling edge of RESET to determine the snoop
mode. If a rising edge occurs on SNPCLK[SNPMD]
after RESET has gone inactive, clocked mode will be
selected. Systems using strobed or synchronous mode
must ensure that no rising edge occurs on
SNPCLK[SNPMD] after RESET has gone inactive.
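
The mode selection described in this note can be summarized in a few lines of Python. This is only an illustrative sketch: the function name, the enum, and the two arguments are hypothetical modeling choices, not part of the device interface.

```python
from enum import Enum


class SnoopMode(Enum):
    STROBED = "strobed"
    CLOCKED = "clocked"
    SYNCHRONOUS = "synchronous"


def snoop_mode(level_at_reset_fall: int, rising_edge_after_reset: bool) -> SnoopMode:
    """Model how SNPCLK[SNPMD] selects the snoop mode (hypothetical helper).

    level_at_reset_fall: logic level (0 or 1) seen at the falling edge of RESET.
    rising_edge_after_reset: True if the pin shows a rising edge after RESET
    has gone inactive.
    """
    # Any rising edge after RESET goes inactive selects clocked mode.
    if rising_edge_after_reset:
        return SnoopMode.CLOCKED
    # Otherwise the strapped level decides: high = strobed, low = synchronous.
    return SnoopMode.STROBED if level_at_reset_fall else SnoopMode.SYNCHRONOUS
```

The ordering of the two checks reflects the warning above: a stray rising edge overrides the strapped level, which is why strobed and synchronous systems must keep the pin quiet.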

Figure 6-4 shows the strobed method of snoop ini- 
tiation. The memory address, SNPNCA, SNPINV, 
and MBAOE# are latched with the falling edge of 
the SNPSTB#. If MAOE# is sampled active (low), 
the SNPSTB# will not cause a snoop. The snoop 
initiation is recognized by the 82495XP, is synchro- 
nized in the next clock, and causes a snoop in the 
following clock. 
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Figure 6-4. Strobed Snoop Mode 
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Figure 6-5 shows the clocked method of snoop ini- 
tiation. The memory address, SNPNCA, SNPINV, 
and MBAOE# are latched with the rising edge of 
SNPCLK when SNPSTB# is first sampled low. 
SNPSTB# must be sampled high for at least one
SNPCLK in order to rearm for another snoop. If
MAOE# is sampled active (low), the SNPSTB# will 
not cause a snoop. The snoop initiation is recog- 
nized by the 82495XP, is synchronized in the next 
clock, and causes a snoop in the following clock. 
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Figure 6-5. Clocked Snoop Mode 
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Figure 6-6 shows the synchronous snoop mode. The 
memory address, SNPNCA, SNPINV, and MBAOE#
are latched with the rising edge of CLK when 
SNPSTB# is first sampled low. SNPSTB# must be 
sampled high for at least one CLK in order to rearm
for another snoop. If MAOE# is sampled active
(low), the SNPSTB# will not cause a snoop. The 
snoop initiation is recognized by the 82495XP, and 
causes a snoop in the next clock. 
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Figure 6-6. Synchronous Snoop Mode 
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6.2.2 RESPONSE PHASE 

The snoop response phase consists of two parts:
1) 82495XP state indication and 2) 82495XP snoop
processing completion. The response phase is ALWAYS
synchronous with the CPU CLK. The
82495XP state indication is presented on MHITM# 
and MTHIT# and remains stable until the next 
snoop. These signals indicate the state of the 
82495XP line just prior to the snoop operation. The 
memory bus controller can predict the final state of 
the 82495XP line knowing the initial state and the 
SNPINV and SNPNCA inputs. The snoop completion
information is given by the SNPBSY# output.
An inactive SNPBSY# output indicates that
the 82495XP is ready to accept another snoop cy- 
cle. 

Figure 6-7 shows the 82495XP response to snoops 
without invalidation. The first snoop is to a line which 
is not currently stored in the cache. 

Figure 6-8 shows the 82495XP response to snoops
with invalidation. 

The SNPBSY# signal will be activated for one of
two reasons: 1) a snoop hit to a modified line, in
which case SNPBSY# will remain active until the
modified line has been written back; 2) a back
invalidation is needed and there is a back
invalidation in process. The SNPBSY# minimum active
time is two CLK periods. This allows external logic
to trap-hold an active SNPBSY# using CLK. The
external logic must first look for an active
SNPCYC# and then trap-hold SNPBSY#.

6.2.3 PIPELINED SNOOPS 

The 82495XP allows the memory bus controller to 
pipeline snoop operations. The 82495XP allows the 
next snoop address to be supplied and the next 
snoop requested before the last snoop has complet- 
ed. 

A set of rules governs the operation of pipelined
snoops. These rules are as follows:

(1) For strobed mode snoops, the memory bus con- 
troller cannot cause a second falling edge of 
SNPSTB# until after the falling edge of 
SNPCYC#. 

(2) For clocked mode snoops, the memory bus con- 
troller cannot cause a second falling edge of 
SNPSTB# to be sampled by SNPCLK, until after 
the falling edge of SNPCYC#. 



[Timing diagram: snoops to a line in the I state, the E or S state, and the M state; traces include SNPINV, MHITM#, and SNPBSY#]



Figure 6-7. Snoops without Invalidation 
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Figure 6-8. Snoops with Invalidation 
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Figure 6-9. Fastest Possible Synchronous Snooping 
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Figure 6-10. Fastest Possible Asynchronous Snooping 



(3) For synchronous mode snoops, the memory bus 
controller cannot cause a second falling edge of 
SNPSTB# to be sampled by CLK, until the CLK 
after SNPCYC# is active. 



6.2.4 OVERLAPPING SNOOPS WITH MEMORY BUS CYCLES



The 82495XP allows snoops to be overlapped with 
data transfers. The 82495XP divides the memory 
bus cycle into 4 main regions as shown below: 



CRDY# | Region 1 | CADS# | Region 2 | BGT# | Region 3 | SWEND# | Region 4 | CRDY# | CADS#



Region 1 is after a previous memory bus cycle (i.e. 
after CRDY#) and before the new memory bus cy- 
cle starts (before CADS#). A snoop in this region is 
looked up immediately and serviced immediately. 



Region 2 is after a memory bus cycle has started 
(CADS#) but before the 82495XP has been granted 
the bus (BGT#). A snoop in this region is looked up 
immediately and serviced immediately. CADS# is 
re-issued for the aborted cycle once the snoop com- 
pletes. 

Region 3 is after the 82495XP has been granted the 
bus and before the SWEND# is completed. A snoop 
in this region has its lookup blocked until after the 
SWEND#. After SWEND#, the snoop response is 
given, but no write-back will be initiated until after 
CRDY#. 

Region 4 is after SWEND# and before CRDY#. A 
snoop in this region is looked up immediately but 
serviced after CRDY#. This snoop is logically treat- 
ed as if it occurred after CRDY# (snoop hits to mod- 
ified data will schedule a write-back which will be 
executed after the conclusion of the current memory 
bus cycle). Note that the result of the snoop 
MHITM#, MTHIT# will be available immediately 
with the look-up. 
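
The four regions can be captured as a small lookup keyed by the most recent memory bus event. The sketch below is illustrative only; the table name, function name, and wording of the behaviors are hypothetical and simply restate the region descriptions above.

```python
# Keyed by the most recent memory bus event seen when the snoop arrives.
# Each entry is (region number, lookup behavior, service behavior).
SNOOP_REGIONS = {
    "CRDY#": (1, "immediate", "immediate"),
    "CADS#": (2, "immediate", "immediate, then CADS# is re-issued"),
    "BGT#": (3, "blocked until after SWEND#", "write-back deferred until after CRDY#"),
    "SWEND#": (4, "immediate", "deferred until after CRDY#"),
}


def classify_snoop(last_event: str) -> tuple:
    """Return (region, lookup, service) for a snoop arriving after last_event."""
    if last_event not in SNOOP_REGIONS:
        raise ValueError(f"unknown memory bus event: {last_event}")
    return SNOOP_REGIONS[last_event]
```

For example, `classify_snoop("BGT#")` reports region 3, where the lookup itself waits for SWEND#, whereas a snoop after SWEND# (region 4) is looked up at once and only its servicing waits for CRDY#.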
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6.2.5 SNOOP INTERLOCK 

The 82495XP uses two interlock mechanisms to ensure
that snoops are identified within the proper region.
The first interlock ensures that once BGT#
has been given, snoops are blocked until after
SWEND#. The second interlock ensures that once
a snoop has been started, BGT# cannot be given
until after the snoop has been serviced.

Figure 6-11 shows how once the 82495XP sees a 
BGT# it blocks all snoops until after SWEND#. If a 
snoop has been initiated, and no SNPCYC# has 
been issued before BGT# assertion, the snoop has 
been blocked. 

Figure 6-12 shows a snoop occurring before BGT#. 
Once the 82495XP has honored a snoop, the 
82495XP, depending on the result of the snoop, may 
ignore BGT# until the snoop is serviced. The 
82495XP will always ignore BGT# when SNPCYC#
is active. If the snoop result is a hit to a modified line
(MHITM# active), the 82495XP will ignore BGT# as
long as both SNPBSY# and MHITM# remain active.
In this case, it is the memory bus controller's
responsibility to hold BGT# until SNPBSY# goes
inactive or reassert it after SNPBSY# becomes
inactive. If the snoop result is not a hit to a modified line
(MHITM# inactive), the 82495XP is capable of accepting
BGT# even when SNPBSY# is active. This
allows the memory bus controller to proceed with a
memory bus cycle by asserting BGT# while the
82495XP is performing back-invalidations.

These two interlock mechanisms provide a flexible 
method of ensuring predictable handling of over- 
lapped snoops. 
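
The second interlock reduces to a simple predicate over three signals. The following is a minimal sketch with a hypothetical function name; each argument is True when the corresponding active-low signal is asserted.

```python
def bgt_accepted(snpcyc_active: bool, snpbsy_active: bool, mhitm_active: bool) -> bool:
    """Return True if the 82495XP would accept BGT# under the interlock rules."""
    # BGT# is always ignored while SNPCYC# is active.
    if snpcyc_active:
        return False
    # After a hit to a modified line, BGT# is ignored while both
    # SNPBSY# and MHITM# remain active (write-back still pending).
    if mhitm_active and snpbsy_active:
        return False
    # Otherwise BGT# is accepted, even during back-invalidations
    # (SNPBSY# active with MHITM# inactive).
    return True
```

A memory bus controller model could call this each clock and either hold BGT# asserted or reassert it once the function returns True.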

NOTE: 

Even when snoops are delayed, address latching is 
performed with SNPSTB# activation. 
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Figure 6-11. BGT# Blocking a Snoop 
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Figure 6-12. Snoop Occurring before BGT# 
6.2.6 SNOOPS CONCURRENT WITH LINE FILL CYCLES



During snoops concurrent with line-fills/allocates,
the following responsibility boundaries must be
fulfilled in order to ensure data consistency:

o If a snoop happens before BGT#, more precisely 
if SNPCYC# is active before BGT#, it is the sys- 
tem's responsibility not to return stale data within 
the line-fill/allocation. 

o If a snoop happens after BGT#, more precisely if 
SNPCYC# is active after BGT#, then the 
82495XP ensures data consistency by providing
interlocks with the CPU which avoid caching of 
stale data. 



6.3 Memory Bus Controller Interface 
Rules 

To begin a cache cycle, the 82495XP outputs the 
CADS# signal. The cache address and other cycle 
parameters are guaranteed to be stable with 
CADS# assertion. These parameters are guaran- 
teed to be stable until CNA# or CRDY# of that cy- 
cle. After CNA# or CRDY# these parameters are 
undefined. 

Either during or after CADS#, the CDTS# signal is
asserted. With CDTS# assertion, either the data is
guaranteed to be stable or the data path is available.



BGT# and CRDY# are required for all (non-snoop) 
cycles. KWEND# and SWEND# are only required 
for those cycles which sample them. 

Once a signal has been sampled, it is a "don't care" 
until CRDY# of that cycle. Additionally, these sig- 
nals plus the attributes MRO#, MKEN#, MWB/ 
WT#, and DRCTM# need only follow setup and 
hold times when they are being sampled. 

For pipelined cycles, the cycle attributes (BGT#, 
KWEND#, . . . ) will only be sampled after CRDY# 
of the previous cycle. 

Note that there are many other rules that govern 
when signals may be asserted in relation to one an- 
other. These may be found in the specific pin de- 
scriptions of each signal in chapter 7. 

Snoop-Write-Back cycles are a subset of the normal 
cycles. Snoop-Write-Back cycles are requested as a 
consequence of snoop hits to Modified lines. Those 
are intervening cycles and are requested by activat- 
ing SNPADS# instead of CADS#. For those cycles, 
the 82495XP only samples the CRDY# response. 
The 82495XP assumes that the memory bus con- 
troller owns the bus to perform the intervening write- 
back (restricted back-off protocol) and that no other 
agents will snoop this cycle. Also the 82495XP will 
ignore CNA# during Snoop-Write-Backs. 
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Figure 6-13. Cycle Progress 
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Figure 6-14. Cycle Progress for Snoop Cycles 



6.4 LOCK# Protocol

The 82495XP provides a LOCK signal for the memo- 
ry bus called KLOCK#. KLOCK# is generated by 
the 82495XP whenever the CPU generates the 
LOCK# signal. KLOCK#, like the other cycle attributes,
is valid with CADS# assertion.

When the CPU generates a LOCK cycle, the 
82495XP always generates a bus cycle. LOCK cy- 
cles are non-cacheable to both the 82495XP and 
CPU, so the information is passed through the 
82490XPs to the CPU with BRDYs generated by the 
MBC. If the LOCKed read cycle is a hit in the 
82495XP, the 82495XP ignores the data that it is 
receiving and supplies data from the 82490XP array 
(in accordance with the BRDYs supplied by the 
MBC). Locked writes are posted like any other write. 
LOCKed cycles, both reads and writes, never 
change the 82495XP tag state. 

During a LOCKed cycle, the MBC must prevent oth- 
er masters from snooping the 82495XP. Specifically, 
the MBC must prevent SNPSTB# between BGT# 
of the first LOCKed transfer, and SWEND# of the 
last LOCKed transfer. 



6.5 Cycle Length 

When CADS# is generated, the 82495XP outputs
CW/R# and MCACHE#. These signals provide the 
MBC with enough information to determine the type 
of 82495XP cycle. Table 6-1 summarizes the cycle 
types for the 82495XP/82490XP. All line-fills and 
write-backs to the 82495XP/82490XP cache oper- 
ate on the entire length of a cache line. 

In addition to the length of the cycle from the
82495XP/82490XP, the memory bus controller may 
need to determine the length of the cycle to the 
CPU. Specifically, for those 82495XP cycles where 
RDYSRC=1, the MBC must decode the i860 XP 
CPU's W/R#, LEN, and CACHE# outputs to deter- 
mine the number of BRDY#s which the MBC will 
provide to the CPU. These signals are captured for 
the current cycle by a user-provided BE latch (see 
Section 7.2 for details). Table 6-2 presents the CPU 
cycle length definitions; see the i860 XP microproc- 
essor Data Sheet (Order #240874) for further de- 
tails. 
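
As an illustration of how an MBC model might apply Table 6-1, the sketch below matches the sampled signals against the table rows. It assumes the truth-table values as read from Table 6-1, with any cell that carries neither a 1 nor an X taken as 0; that reading, the `ROWS` list, and the function name are assumptions made for this example only.

```python
# (cycle type, CW/R#, RDYSRC, MCACHE#, MKEN#); None stands for "X" (don't care).
ROWS = [
    ("Posted Write", 1, 0, 1, None),
    ("Write Back", 1, 0, 0, None),
    ("Non-Cacheable Read", 0, 1, 1, None),
    ("Non-Cacheable Read", 0, 1, 0, 1),
    ("Cacheable Read", 0, 1, 0, 0),
    ("Allocation", 0, 0, 0, None),
]


def cycle_type(cw_r: int, rdysrc: int, mcache_n: int, mken_n: int):
    """Return the cycle type for the given signal levels, or None if no row matches."""
    sampled = (cw_r, rdysrc, mcache_n, mken_n)
    for name, *pattern in ROWS:
        # A row matches when every non-don't-care column equals the sampled level.
        if all(want is None or want == got for want, got in zip(pattern, sampled)):
            return name
    return None
```

Returning None for combinations outside the table keeps the sketch from guessing at behavior the table does not define.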
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Table 6-1. 82495XP/82490XP Cycle Determination 




Cycle Type            CW/R#   RDYSRC   MCACHE#   MKEN#

Posted Write            1       0        1         X
Write Backs             1       0        0         X
Non-Cacheable Read      0       1        1         X
Non-Cacheable Read      0       1        0         1
Cacheable Read          0       1        0         0
Allocation              0       0        0         X







Table 6-2. i860 XP CPU Cycle Determination 




W/R#   LEN   CACHE#   MKEN#   Cycle Description             Burst Length

 0      0      1        -     Non-Cacheable 64-Bit Read          1
 0      0      -        1     Non-Cacheable 64-Bit Read          1
 1      0      1        -     64-Bit Write                       1
 -      0      1        -     I/O and Special Cycles             1
 0      1      1        -     Non-Cacheable 128-Bit Read         2
 0      1      -        1     Non-Cacheable 128-Bit Read         2
 1      1      1        -     128-Bit Write                      2
 0      -      0        0     Cache Line Fill                    4
 1      -      0        -     Cache Write-Back                   4




NOTE: 

If MRO# is asserted to the 82495XP, the effect on i860 XP CPU cycle determination is the same as when MKEN# = 1. 
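
One possible way for an MBC model to derive the number of BRDY#s from Table 6-2 is sketched below. It assumes that table cells carrying neither a 1 nor a don't-care mark are 0, and it folds the MRO# note in as an extra argument; the function and parameter names are hypothetical.

```python
def brdy_count(w_r: int, length: int, cache_n: int, mken_n: int,
               mro_asserted: bool = False) -> int:
    """Return the CPU burst length (number of BRDY#s) per Table 6-2 as read here."""
    if cache_n == 0:
        if w_r == 1:
            # CACHE# active on a write: cache write-back, 4 transfers.
            return 4
        if mken_n == 0 and not mro_asserted:
            # Cacheable read accepted by the MBC: cache line fill, 4 transfers.
            return 4
        # MKEN# = 1 (or MRO# asserted) makes the read non-cacheable; fall through.
    # Non-cacheable cycles: LEN selects a 64-bit (1) or 128-bit (2) transfer.
    return 2 if length == 1 else 1
```

For instance, a read with CACHE# = 0 and MKEN# = 0 yields 4, while the same read with MRO# asserted yields 1 or 2 depending on LEN.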



6.6 Consecutive Cycles 

Because an 82495XP line can be longer than a CPU
line, there are circumstances where a read miss will 
be to a line that is currently being filled. If this is the 
case, the 82495XP treats this like a read hit, but 
supplies data after CRDY# for the line fill. Data is 
supplied from the 82490XP array. 



6.7 CPU/Memory Bus Concurrency 

The 82495XP allows concurrency between the CPU 
and memory buses. CPU bus cycles will either be 
serviced locally by the 82495XP (hits) or require 
memory bus service. Whenever a CPU cycle re- 
quires memory bus service, it will be scheduled to 
run on the memory bus, and CPU bus activity will be 
allowed to continue. 

Examples of concurrency are: 

— Snoops and CPU bus operations 

— Posted writes with CPU and memory bus opera- 
tions 



— CPU bus operation on the back of long line fills 
(82495XP line longer than the CPU line) 

— Allocations and replacements with CPU and 
memory bus operations. 

In certain cases, consistency of data and prevention 
of deadlocks preclude concurrency. Problems may 
occur when the current memory bus cycle changes 
the tag state and therefore affects the operation of 
the next CPU cycle request. In those cases the 
82495XP will hold concurrency to ensure data con- 
sistency. Handling of those cases is completely 
transparent to the MBC. 

The 82490XP supports two modes of memory bus 
operation: clocked mode and strobed mode. In 
clocked mode, memory bus signals are sampled by 
the 82490XP on rising edges of MCLK. Similarly, 
memory bus data and signals are output by the 
82490XP with respect to MCLK (or MOCLK) rising 
edge transitions. 

In strobed mode, memory bus signals are sampled 
or output with respect to rising and falling edges of 
other signals. Strobed mode has the advantage of 
not requiring setup and hold times to a CLK or MCLK 
edge. 
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6.8 Memory Bus Modes 



[Figure panels: Clocked Memory Bus Mode (CLK); Strobed Memory Bus Mode (Active, Inactive)]



Figure 6-16. Clocked and Strobed 
Mode Sampling 

6.8.1 CLOCKED MODE 

In clocked mode operation MCLK is used to refer- 
ence the signals MDATA0-MDATA7, MSEL#, 
MFRZ#, MZBT#, MBRDY#, and MEOC#. Clocked 
mode will be selected if the 82490XP detects a 
clock at its MCLK input after RESET. MCLK need 
not have any relation to CLK. If this is the case, the 
memory bus is said to be operating in "clocked 
asynchronous" mode. If MCLK = CLK, the memory 
bus is operating in "clocked synchronous" mode. If 
MCLK × N = CLK (where N = 2, 3, 4, ...), the
memory bus is operating in "clocked divided syn- 
chronous" mode. These three clocked modes, asyn- 
chronous, synchronous, and divided synchronous, 
are not differentiated by the 82490XP. 
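
Although the 82490XP does not distinguish these cases, a system designer may find it useful to name them. The sketch below classifies a clocking arrangement from the two frequencies; the `derived_from_clk` flag is an assumption added for this example, since equal frequencies alone do not make two clocks synchronous.

```python
def clocked_mode(clk_hz: int, mclk_hz: int, derived_from_clk: bool) -> str:
    """Name the clocked memory bus mode for a given CLK/MCLK arrangement.

    derived_from_clk: True when MCLK is generated from CLK (hypothetical flag).
    """
    if not derived_from_clk:
        # MCLK has no defined relation to CLK.
        return "clocked asynchronous"
    if mclk_hz == clk_hz:
        return "clocked synchronous"
    # MCLK x N = CLK with N = 2, 3, 4, ...
    if clk_hz % mclk_hz == 0 and clk_hz // mclk_hz >= 2:
        return "clocked divided synchronous"
    return "clocked asynchronous"
```

With a 50 MHz CLK, an MCLK of 25 MHz derived from CLK would be reported as divided synchronous (N = 2).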

MOCLK controls a transparent latch at the 82490XP 
data output pins. If a clock is provided at this input, 
data is latched with MOCLK going low. This clock is 
available in clocked mode only. MOCLK allows the 
system to provide a greater MDATA hold time by 
skewing MOCLK from MCLK. If MOCLK is tied high, 
MDATA is driven from MCLK. 

6.8.1.1 Synchronous Clocked Mode 

In synchronous clocked mode MCLK = CLK. This 
means the CPU clock is used for 82495XP, 
82490XP, and the memory bus. A synchronous 
memory bus allows memory to communicate with 
the 82495XP without synchronizers since the 
82495XP runs with CLK. With a synchronous design, 
however, high clock frequencies must be routed to 
all parts of a system with minimal skew. This may 
not be possible with future projected frequencies. A 
synchronous memory system and memory bus con- 
troller must be redesigned when future speed up- 
grades are required. 



6.8.1.2 Asynchronous Clocked Mode 

In asynchronous clocked mode, MCLK is not the 
same frequency as CLK. Some memory signals, 
since they reference MCLK, must be synchronized 
to CLK to communicate with the 82495XP. For ex- 
ample, when a cycle completes, the memory system 
asserts a signal, driven from MCLK, to the memory 
bus controller which will be synchronized to CLK to 
become CRDY#. This is because CRDY# is syn- 
chronous to CLK and not MCLK. 

Asynchronous mode allows the rest of the system to 
run at a lower frequency than the CPU CLK. Not only 
does this simplify system design, but allows the de- 
signer to place hooks to allow the same design to 
scale easily to a higher frequency. If all the features 
of the 82495XP are used properly, an asynchronous 
memory design does not have to incur much synchronization
penalty. For example, MEOC# is synchronous
to the memory environment (MCLK). This
allows the memory system to end the current cycle 
and start the next before CRDY# is synchronized in 
the CPU environment. 



6.8.1.3 Divided Synchronous Clocked Mode 

Divided synchronous clocked mode is a subset of 
synchronous clocked mode. It allows two things to 
happen: One, the memory system is capable of 
communicating with the 82495XP without synchroni- 
zation. Two, a slower frequency clock may be routed 
around the system. 

Divided synchronous mode still requires clock skew 
restrictions. It also carries the same scalability draw- 
backs that full synchronous mode does. 



6.8.2 STROBED MODE 

Strobed mode is configured on the 82490XP by 
strapping MCLK high. In strobed mode: 

— MDATA0-MDATA7 are sampled with respect to 
edges of MEOC#, MISTB, and MOSTB. 

— For write cycles, MFRZ# is sampled when 
MEOC# goes active. 

— MZBT# is sampled when MSEL# is inactive, 
and is latched when MSEL# goes active. 
MZBT# is also sampled for the next operation 
when MSEL# is active and MEOC# goes active. 

By not using MCLK, strobed mode has no setup and 
hold time restrictions, and is scalable to higher fre- 
quencies. Strobed mode does, however, require 
synchronization to 82495XP CLK synchronous sig- 
nals. 
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6.9 Memory Bus Operation 

All data is handled by the 82490XP cache RAMs. 
The 82495XP instructs the 82490XP whether to use 
the data array or buffers, and specifically which buff- 
er to use. The MBC is responsible for bursting data 
in and out of the 82490XP's, in and out of the CPU 
during miss cycles, and indicating when the opera- 
tion is finished. Communication between the 
82490XPs and memory bus may be done in a 
clocked mode or strobed mode. See the Memory 
Bus Modes section for more details. 

An 82490XP has 4 memory buffers. It has 2 memory
cycle buffers, one write-back buffer, and one snoop 
buffer. Each buffer is capable of holding an entire 
82495XP line of the longest configurable length. 

The memory cycle buffers of the 82490XP are used 
for posting writes and holding data during 
82495XP/82490XP line-fills. The write-back buffer is 
used for holding data from a cache replacement. 
This data is ready to be written out, and the write- 
back buffer is snoopable. The snoop buffer is used 
to hold modified data that has been hit by a snoop. 
Since snoop hits are the highest priority cycle, this 
buffer will be emptied before any other cycle or 
snoop request begins. 



6.9.1 82490XP BUFFERS AND MUXES 

The 4 82490XP memory buffers are all multiplexed 
(muxed) to the memory bus. The mux is used to se- 
lect which buffer is on the bus, and specifically which 
slice of that buffer is on the bus. MBRDY# assertion 
increments a counter for this mux which selects the 
next slice of that buffer. 

The counter used to increment through the buffer 
slices is called the memory burst counter. The mem- 
ory burst counter follows the CPU burst order de- 
pending on the subline address of the initial slice. 
Once the MBC is finished with a buffer, MEOC# is 
asserted to switch the mux to the next buffer to be 
used. MEOC# will also reset the counter and latch 
the last slice of data. 

On the CPU side, the 82490XP contains a CPU buff- 
er and mux. The CPU buffer captures data from the 
appropriate memory buffer or 82490XP array to feed 
it to the CPU. The mux selects which slice is muxed 
to the CPU bus. The counter for this mux is incre- 
mented with BRDY#. 

The 82490XP array contains a mux that selects 
which way, based on the MRU algorithm, will be 
read during hit cycles. This mux is used during write 
cycles to write to the correct way. 



6.9.2 MEMORY CYCLE BUFFERS 

There are 2 memory cycle buffers in the 82490XP. 
They are used for line-fills, allocates, and memory 
writes. The buffers are 64-bits wide (per 82490XP) to 
support 8 transfers with 8 memory bus I/O pins 
(maximum configuration). The 82490XP alternates 
use of these buffers. When one buffer has a posted 
write or is being used for a memory read, the other 
one is available for the next cycle. 

During allocation cycles, read for ownership may be 
implemented by using the MFRZ# signal. If MFRZ# 
is sampled active during the write cycle, the memory 
cycle buffer will freeze the write data in the buffer so 
the subsequent line-fill fills around it. This way the 
write cycle need not be written to memory. The line 
must be tagged as modified. 
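
The "fill around" behavior can be pictured as a merge in which frozen write data takes precedence over incoming line-fill data. The sketch below is a simplified model: byte granularity, the dictionary representation of the frozen write, and the function name are all assumptions made for illustration.

```python
def fill_around(frozen_write: dict, fill_data: list) -> list:
    """Merge line-fill data around write data frozen in a memory cycle buffer.

    frozen_write maps a byte offset within the line to the value written by
    the CPU; fill_data is the line as returned by the line fill.
    """
    line = list(fill_data)  # copy so the caller's fill data is left untouched
    for offset, value in frozen_write.items():
        # Frozen write data wins over the fill data at the same offset.
        line[offset] = value
    return line
```

Because the merged line now differs from memory, it is consistent with the requirement above that the line be tagged as modified.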



6.9.3 WRITE BACK BUFFER AND SNOOP BUFFER






The write back buffer and snoop buffer are both 64- 
bits to handle the maximum 82495XP line length. 
The write back buffer is used when replaced data 
must be written back to main memory (including 
FLUSH and SYNC cycles) and the snoop buffer is 
used when data must be written out from a snoop 
hit. 

Before a line fill begins, the 82495XP checks to see 
if it must remove a modified line to make room for 
the line-filled line. If so, the modified line is placed in 
the write back buffer and the line-fill is filled through 
a memory cycle buffer. Should the line-fill be select- 
ed as non-cacheable, both buffer contents are dis- 
carded and the 82490XP array value remains as it 
was before the line-fill. 

There is no need to run the line-fill, replacement 
(write back), FLUSH, or SYNC cycles contiguously. 
If a snoop is requested between the two cycles, the 
write back buffer is snoopable, and data can be writ- 
ten directly out of it if need be. 



6.9.4 MEMORY BUS CONTROL SIGNALS 

The main memory bus control signals are MSEL#, 
MEOC#, MBRDY#, and CRDY#. These signals 
control the 82490XP data path, buffers, and muxes. 

MSEL# selects which 82490XPs are being used in 
the current cycle by qualifying the MBRDY# signal. 
If MSEL# is inactive, MBRDY# is not recognized for 
that 82490XP. MSEL# is also used to reset the 
memory burst counter. If MSEL# goes inactive, the 
counter is initialized to its starting value. This is useful
for aborted/restarted cycles. MSEL# may remain
active for many or all cycles. MSEL# must, howev- 
er, be inactive sometime after RESET to initialize the 
memory burst counter for the first time. 

MEOC# is asserted by the MBC to finish with
the current buffer, and switch the memory bus to the 
next buffer to be used. MEOC# latches in the last 
piece of data and resets the memory burst counter 
before switching to the new buffer. 

MBRDY# is used to increment the memory burst 
counter to select the next slice of data. This will
strobe data out of the 82490XP (write cycles) or load 
data into the 82490XP (read cycles). MBRDY# is 
ignored by the 82490XP if MSEL# is inactive. 

CRDY# finishes the current cycle. Once CRDY# is 
asserted, the 82490XP disposes of the information 
in the buffers used in that cycle, and loads information
into the 82490XP array. CRDY# must be asserted
on the same clock as MEOC#, or after it, for
a particular cycle.
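
The interaction of MSEL#, MBRDY#, and MEOC# with the memory burst counter can be sketched as a small state model. This is an illustrative toy, not a description of the silicon: the class and method names are hypothetical, and the priority of MEOC# over MBRDY# and its qualification by MSEL# are assumptions of this sketch.

```python
class MemoryBurstCounter:
    """Toy model of the 82490XP memory burst counter (illustrative only)."""

    def __init__(self) -> None:
        self.count = 0            # index of the buffer slice currently selected
        self.buffer_switches = 0  # number of times MEOC# moved to the next buffer

    def step(self, msel: bool, mbrdy: bool = False, meoc: bool = False) -> int:
        """Apply one sampling point; each argument is True when the signal is active."""
        if not msel:
            # MSEL# inactive: counter returns to its starting value and
            # MBRDY# is not recognized.
            self.count = 0
        elif meoc:
            # MEOC# finishes with the current buffer and resets the counter.
            self.count = 0
            self.buffer_switches += 1
        elif mbrdy:
            # MBRDY# qualified by MSEL# selects the next slice.
            self.count += 1
        return self.count
```

The model only counts slices; it does not attempt to reproduce the CPU burst order that the real counter follows.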



6.9.5 82490XP DATA PATH 

An example 82490XP read data path is shown in
Figure 6-17. The path between the CPU and memory
bus is a flow-through path, not a clocked path. Each
entire 82495XP cache line of data in the CPU buffer 
is available at the memory buffer with some propa- 
gation delay. Likewise, each entire 82495XP cache 
line of data in the memory buffer is available in the 
CPU buffer with some propagation delay. Data is 
burst into and out of the memory buffer using 
MBRDY# or MISTB/MOSTB. Data is burst into and 
out of the CPU buffer using BRDY#. This means 
there is no synchronization required between memo- 
ry and CPU data paths. 

To give an example how the path works, during a 
CPU line fill, data may be returned to the CPU in two 
different fashions. One, each time the memory buff- 
er fills a dword, BRDY# may be asserted a clock 
later to burst it back to the CPU. Two, the memory 
buffer can be filled and then BRDY# asserted on 
four consecutive clocks to burst data back to the 
CPU. 
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Figure 6-17. 82490XP Read Data Path 
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6.9.6 WRITE CYCLES 

There are 3 basic types of write cycles: CPU gener- 
ated write cycles, write back cycles caused by a 
cache replacement, and snoop write back cycles 
caused by a snoop hit. All write cycles, except the 
snoop write back, begin with CADS# assertion. The 
snoop write back cycle begins with SNPADS#. 



6.9.6.1 CPU Generated Write Cycles 

When the CPU begins a write cycle, four things can 
happen to it. One, the CPU write is a hit to a modi- 
fied or exclusive line. In this case the write is termi- 
nated by the cache immediately and invisibly to the 
MBC. 

Two, the write is to a shared location. This type of 
write is posted to the 82490XP memory cycle buffers 
and the cycle is terminated by the cache. If a memo- 
ry cycle buffer is occupied with a write cycle, the 
CPU waits until the previous write completes. The 
write cycle must be written to the memory bus so
that copies of the written location in other caches
are invalidated.

Three, the write is a cache miss. This type of write is 
posted to a memory cycle buffer if the 82490XP is 
not waiting for another posted write to complete. If 
PALLC# is asserted, the write may be turned into an 
allocation. 

Four, the write is a LOCKed write. LOCKed writes 
are posted regardless of the tag state. The write is 
then treated as if it were a miss except that there is 
no change in the tag state and no allocation allowed. 
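
The four cases above can be condensed into a decision function. The sketch below is a summary aid only; the function name and result strings are hypothetical, and a cache miss is represented here by the tag state "I", which is an assumption of this example.

```python
def cpu_write_action(tag_state: str, locked: bool = False,
                     pallc_active: bool = False) -> str:
    """Summarize how a CPU write is handled, given the tag state of the line."""
    if locked:
        # LOCKed writes are posted regardless of the tag state.
        return "posted; tag state unchanged; no allocation"
    if tag_state in ("M", "E"):
        # Hit to a modified or exclusive line: handled entirely in the cache.
        return "terminated by the cache; invisible to the MBC"
    if tag_state == "S":
        # Shared hit: must reach the memory bus so other caches invalidate.
        return "posted; written to the memory bus"
    if tag_state == "I":
        # Miss: posted, and possibly turned into an allocation.
        if pallc_active:
            return "posted; may become an allocation"
        return "posted"
    raise ValueError(f"unknown tag state: {tag_state}")
```

The LOCKed check comes first because the text states that locked writes are posted whatever the tag state is.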



6.9.6.2 Cache Generated Write Cycles 

The 82495XP/82490XP will generate a write cycle in 
three situations: a line fill or allocation causing a 
cache replacement, a snoop hit to a modified loca- 
tion, and write backs caused by FLUSH or SYNC. 
Write backs caused by FLUSH or SYNC are indistinguishable
from write-back cycles caused by replacement.
Cache generated write cycles are the length
of a cache line. 

Cache replacements and FLUSH/SYNC cycles 
cause a line (or two lines if sectored) of cache data 
to be placed in the write-back buffer of the 82490XP. 
If no cycle is pending, CADS# is asserted and the 
data is written out. If a snoop hits the write-back 
buffer, the data is written out via SNPADS# like a 
normal snoop hit. The write back is then cancelled 
since the data was written through the snoop hit. 



A snoop hit to a modified location causes a line of 
cache data to be written out to memory. Snoop hits 
are the highest priority cycle and must be serviced 
immediately. A snoop hit to a modified location caus- 
es the snooped line to be written to the snoop buffer 
of the 82490XP. SNPADS# is then asserted and the 
snoop is written out. 

6.9.6.3 Memory Bus Controller Responsibility 

The MBC recognizes a write cycle with CADS# and 
CW/R# (or SNPADS# for snoop cycles). If 
MCACHE# is active, the MBC knows the cycle is a 
write back cycle, otherwise it is a CPU-generated 
cycle. 

CPU-generated write cycles are written to the main 
memory bus so that other caches can invalidate 
their copies of this information. The other caches do 
this by snooping with SNPINV active during snoop 
initiation if they detect a write cycle on the bus. 

Once the MBC detects CDTS# active, the data will 
be available for writing in the next clock in the appro- 
priate 82490XP buffer. The MBC should assert 
MSEL# so bursting is enabled, and burst through 
the write using MBRDY# (or MOSTB). MSEL# activation
also causes MZBT# to be sampled. MZBT#
must be inactive at this time if the data will be written 
according to CPU burst order. 

Once the write cycle is complete, MEOC# must be 
asserted to end the write cycle and switch to the 
next pending cycle. If this write cycle is turned into 
an allocation, MFRZ# is sampled with MEOC# to 
freeze the write data in the 82490XP. 

MEOC# simply switches buffers from the current 
one in use to the buffer of the next pending cycle.
CRDY# needs to be asserted to actually end the 
cycle and allow the 82495XP and 82490XP to dis- 
pose of the information. 

6.9.6.4 Write Allocation and Read for 
Ownership 

The 82495XP/82490XP supports write allocation. 
An allocation cycle is a read of a cache line caused 
by a write miss to the same location. In its simplest 
form, a write miss is written to memory, then the 
82495XP requests a line from that same location. 
Meanwhile, the CPU only sees the write cycle. 

Write allocation may only be done if PALLC# is ac- 
tive during CADS# of the write cycle. For the alloca- 
tion to occur, MKEN# must be returned active dur- 
ing KWEND# of the write cycle. The write cycle may 
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be an actual write or a "dummy" write. Dummy 
writes are write cycles that are terminated in the 
82495XP and 82490XP as if they were normal 
writes, but the data is not actually written to memory. 
This saves a data write to memory. 

During write allocation, the write cycle will progress 
like a normal write cycle except MKEN# must be 
active during KWEND# activation. If the write cycle 
is a dummy write, MFRZ# must be used with 
MEOC# so that the line filled data is read around 
the write data into the 82490XP buffer. The line fill 
cycle is like any other line fill cycle except the CPU 
doesn't get any data. If a dummy write was per- 
formed, DRCTM# must be asserted during 
SWEND# activation to fill the line to the M state, 
and any cache supplying the data must invalidate its 
copy. 

Using dummy write cycles and filling data to the M 
state from another cache or memory is called Read 
For Ownership. This is because ownership is being 
transferred. In read for ownership cycles, memory is 
avoided as much as possible. First, the dummy write 
cycle avoids memory. Second, a line fill is performed 
as a cache to cache transfer with DRCTM# assert- 
ed. All caches were snooped with invalidate to elimi- 
nate their copies. 

For allocation cycles, SWEND# is not sampled for 
the write portion of the allocation. 
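
The allocation conditions above can be summarized in a short sketch. This is only an illustration of the rules as stated in this section, not a model of the actual hardware: the function name, argument names and returned keys are hypothetical, and treating a dummy write without allocation as an error is an assumption (the text only describes dummy writes in the context of allocation).

```python
def write_miss_plan(pallc_active, mken_active, dummy_write=False):
    """Summarize what the MBC must do for a write miss (illustrative only).

    pallc_active: PALLC# sampled active during CADS# of the write cycle.
    mken_active:  MKEN# returned active during KWEND# of the write cycle.
    dummy_write:  the write is terminated without writing memory.
    """
    # Allocation requires both conditions named in the text.
    allocate = pallc_active and mken_active
    if dummy_write and not allocate:
        # Assumption: without the following line fill the write data
        # would be lost, so this combination is rejected here.
        raise ValueError("dummy write requires an allocation")
    return {
        "allocate": allocate,
        # A dummy write never reaches memory.
        "write_reaches_memory": not dummy_write,
        # Dummy write: MFRZ# with MEOC# so the fill reads around the write data.
        "use_mfrz_with_meoc": allocate and dummy_write,
        # Dummy write: DRCTM# during SWEND# so the line is filled to the M state.
        "assert_drctm_at_swend": allocate and dummy_write,
    }
```

For example, `write_miss_plan(True, True, dummy_write=True)` describes a Read For Ownership: the line is allocated, memory is not written, and both MFRZ# and DRCTM# are needed.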

6.9.7 READ CYCLES 

The CPU initiates all read cycles. These are usually 
line fills to the CPU and line fills to the 
82495XP/82490XP. The signal MCACHE# is output 
with CADS# to indicate whether this cycle may or 
may not be cacheable. If cacheable, MKEN# is
returned by the MBC to ultimately determine
cacheability.

Read hit cycles are serviced by the cache without 
MBC intervention. The only read cycles seen by the 
MBC (except I/O or special) are read misses and 
locked read cycles. 

Read misses cause CADS# to be asserted at most
two clocks after ADS# of the CPU read cycle. If
cacheable, as determined from MCACHE#, the
MBC will return 4 BRDY#s to the CPU and 4 or 8
MBRDY#s to the 82495XP/82490XP. If the transfer is
non-cacheable, the i860 XP CPU LEN and CACHE#
outputs indicate the number of transfers to be given
to the CPU. MBRDY# need not be used in the
transfer if only a single piece of data is required by
the CPU.



If the read cycle is cacheable, it may cause another 
cached line to be bumped out of the cache. This is 
called a replacement and, if modified, causes a write 
back cycle. While one of the 82490XP memory buff- 
ers is being filled for the line fill, the write back buffer 
is loaded. If the line fill turns out to be non-cache- 
able at the end of the transfer, the write-back buffer 
is discarded, and the line in the cache remains valid. 
Otherwise, CADS# will be generated after the read 
cycle so the write back can be performed. The write 
back need not happen immediately after the line fill 
since the write-back buffer is snoopable. 

All locked reads go to the memory bus. If the read is
a cache hit to [M], the 82495XP/82490XP will ignore
the data that the MBC returns, and provide it from its
array. Locked reads are not cacheable by the CPU
or the 82495XP/82490XP. Snoop write-backs that
are a result of a LOCKed read/write request must
update memory.

6.9.7.1 Memory Bus Controller Responsibility 

Once the MBC sees a read cycle on the memory 
bus, it must determine whether the read is cache- 
able or non-cacheable using MCACHE# and its own 
address decoding. If non-cacheable, the CPU
expects a number of transfers as determined by its
LEN and CACHE# outputs. If cacheable, the CPU
expects 4 transfers, and the cache expects 4 or 8
(configuration dependent).

MKEN# is sampled during KWEND# to determine 
cacheability. Before MKEN# is sampled, KEN# is 
active assuming cacheability for the CPU. MKEN# 
must be sampled 1 clock before the first BRDY# to 
make the cycle non-cacheable. 

Once the read cycle is given to the memory system, 
all 82495XP/82490XP caches snoop to see if they 
contain the data in modified form. If so, the MBC 
must abort the cycle in memory and receive the data 
directly from the 82495XP/82490XP that has it, or 
wait until that cache writes it to memory. If the data
transfer avoids memory, i.e., goes cache to cache,
DRCTM# must be asserted with SWEND# to place
the line in the [M] state, and the cache giving the data
must invalidate its copy.

MSEL# is activated and MBRDY# (or MISTB) used 
to sample input data from the read cycle. Once 
CDTS# has been seen active, the CPU read data 
path is clear. BRDY# may be returned to the CPU 
sometime after each MBRDY# for each piece of in- 
put data (see MDATA setup to CLK). Once the 
transfer completes, MEOC# and CRDY# are asserted
to complete the cycle in the 82495XP/82490XP.

6.9.8 I/O AND SPECIAL CYCLES

I/O and special cycles (flush, etc.) are decoded by
the 82495XP and not posted. These cycles wait until 
all buffers have been written, and all cycles have 
been completed, before they cause CADS# asser- 
tion. The CPU waits until the special cycle ends with 
the MBC's BRDY# assertion before it continues. 

When the 82495XP/82490XP is performing a 
FLUSH or SYNC, many write back cycles are re- 
quired. These cycles look like ordinary write back 
cycles, and should be handled as such. FSIOUT# is 
active during these write back cycles, so when FSI- 
OUT# goes inactive the cycle is complete and the 
memory bus controller can supply BRDY# to the 
CPU. 



6.10 Different Bus Widths 

The 82490XP is capable of supporting either 64- or 
128-bit memory bus widths. Depending on the con- 
figuration, the 82490XP's CPU and I/O busses may 
be multiplexed. The following diagram shows how 
an i860 XP CPU may be connected to a 128-bit 
memory bus: 



[Diagram not reproduced. Recoverable labels: i860 XP CPU; 82490XP devices 1, 2, 3, 12, 13 and 15; memory data lines D4-D7, D60-D63, D68-D71 and D124-D127. Drawing number 240956-30.]



Figure 6-18. 82490XP On Wide Bus 



In this example, the CPU port of the 82490XPs is in 
x4 mode and the memory bus port is in x8 mode. 
This allows all 128 bits of the memory bus to be 
multiplexed to the 64-bit CPU bus. 

For read cycles, each MBRDY# loads 8 bits into
each 82490XP, for a total of 128 bits of data. It
takes 2 BRDY# assertions to load this into the CPU.
The first BRDY# assertion drives the first 4 bits from
each 82490XP onto the CPU bus, and the next BRDY#
assertion drives the remaining 4 bits.

For a 64-bit write cycle, the data is available on
the appropriate data bits. On the i860 XP CPU
with a 128-bit bus, this is determined by CPU
address bit A3. The other data bits are undefined. For
write-back cycles, all 128 bits are available at once.
MBRDY# assertion will strobe the next 128 bits onto
the memory bus.
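
The multiplexing described above can be sketched as follows. This is a rough illustration under stated assumptions, not a description of the silicon: the function names are hypothetical, the bit assignment per device is extrapolated from the labels in Figure 6-18 (device 1 carrying D4-D7 and D68-D71, device 15 carrying D60-D63 and D124-D127), and the choices that the low half is delivered first and that A3 = 0 selects D0-D63 are assumptions that the text does not state.

```python
MASK64 = (1 << 64) - 1


def device_memory_bits(device):
    """Memory data bits handled by one of the sixteen 82490XPs (0..15).

    Assumed from the figure labels: device i carries D(4i)..D(4i+3)
    and D(64+4i)..D(64+4i+3), i.e. 8 memory bits per device.
    """
    low = list(range(4 * device, 4 * device + 4))
    high = list(range(64 + 4 * device, 64 + 4 * device + 4))
    return low, high


def cpu_transfers(word128):
    """Split one 128-bit MBRDY# transfer into two 64-bit BRDY# transfers.

    Assumption: the low half (D0-D63) is delivered first.
    """
    return [word128 & MASK64, (word128 >> 64) & MASK64]


def write_lane(data64, a3):
    """Place a 64-bit CPU write on the 128-bit memory bus.

    Assumption: A3 = 0 selects D0-D63 and A3 = 1 selects D64-D127.
    The other half of the bus is undefined; it is left as zero here.
    """
    return (data64 & MASK64) << (64 if a3 else 0)
```

Sixteen devices with 8 memory bits each account for the full 128-bit memory bus, and with 4 CPU bits each for the 64-bit CPU bus.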



7.0 DETAILED PIN DESCRIPTIONS 

The following chapter provides a detailed descrip- 
tion of each pin of the 82495XP and 82490XP. The 
pins have been categorized by function. Each pin 
description has a heading which summarizes the 
most important aspects of the pin. The heading is 
organized as: 



Pin Name

Name Meaning

Pin Function

I/O, 82495XP/82490XP/i860 XP CPU, (location) Signal Type

Synchronous/Asynchronous




for example, 

CADS# 

Cache Address Strobe 

Indicates beginning of cache cycle 

Output from 82495XP (pin E3) Cycle Control Signal 

Synchronous to CLK 

Following the heading are three sections. The first
section, Signal Description, provides information on
what the signal does, how to use it, and in what
modes it operates. The second section, When Sampled
or When Driven, indicates all the exact places
where the part samples the signal, generates the
signal, or neither. The third section, Relation to Other
Signals, mentions the other signals that are affected
by this signal, synchronization requirements,
and shared pins.

All specific information about each pin is provided in
this chapter.

7.0.1 CONFIGURATION SIGNALS 

These signals are inputs to the 82495XP and 
82490XP that are sampled at RESET and alter the 
configuration and operation of the cache. 




Figure 7-1. Configuration Input Setup and Hold

Each set of configuration inputs may have different
setup times, but all signals have the same hold time:
the signals may be released on the CPU clock edge
on which RESET is detected inactive. Some
configuration signals are strapping options and
cannot change their value during 82495XP operation.



7.0.2 CPU BUS INTERFACE SIGNALS 

These pins comprise the interface between CPU 
and 82495XP/82490XP. The signals in this interface 
are not flexible; Chapter 10 addresses the use of 
these signals. The following are the CPU bus inter- 
face signals: 



SET0-SET10    TAG0-TAG11    CFA0-CFA6

ADS#          W/R#          D/C#

M/IO#         HITM#         LOCK#

PWT           PCD           LEN

BRDYC1#       KEN#          AHOLD

EADS#         BE0#-BE7#     INV

BOFF#







The majority of these signals must be connected 
strictly between the i860 XP CPU and the 82495XP. 
However, a subset of these signals is needed by the 
MBC to decode the i860 XP CPU cycle in cases 
where the MBC provides BRDYs to the CPU. For 
these purposes the following signals must also be 
inputs to a latch controlled by the 82495XP's BLE# 
output: 



BE0#-BE7#     CACHE#        CTYP

LEN           PCD           PCYC

PWT







7.0.3 82495XP/82490XP INTERFACE SIGNALS 

These pins comprise the interface between the 
82495XP and 82490XP. The 82495XP uses these 
pins to control the 82490XP and its buffers. The sig- 
nals in this interface are not flexible; Chapter 10 ad- 
dresses the use of these signals. The following are 
the 82495XP/82490XP interface signals: 

WRARR# WAY MAWEA# 

BUS# MCYC# WBWE#[LR1] 

WBA[SEC2] WBTYP[LR0] BRDYC2# 

BLAST# BOFF# 



SIGNAL DESCRIPTIONS 



7.1 BGT# 

Bus Guaranteed Transfer 

Signals to the 82495XP the memory bus controller's
commitment to complete the bus cycle.

Input to 82495XP (pin M4) Cycle Progress Signal 

Synchronous 

7.1.1 SIGNAL DESCRIPTION 

The 82495XP owns all bus cycles (initiated by 
CADS#) until the memory bus controller accepts 
ownership. During this time cycles may be aborted 
due to a snoop. The memory bus controller signals 
its acceptance of ownership by driving BGT# active 
into the 82495XP. Once BGT# is driven active, the 
memory bus controller is responsible for completing 
the cycle on the memory bus. CRDY# signals com- 
pletion of the cycle. 

Once BGT# is asserted, other devices may not perform
snoops into the 82495XP until the end of the
snooping window, SWEND# activation. The snoop
address is latched if a snoop is requested between
BGT# and SWEND#, but the snoop does not begin
until after SWEND# is asserted. SNPCYC# will not
be asserted until the snoop window ends with
SWEND# asserted. The advantage of asserting
BGT# early is that it allows the 82495XP to start
inquiries to the CPU, load the write-back buffer, and
progress forward in the CPU bus pipeline. The
disadvantage is that snooping of this 82495XP is now
blocked until SWEND# is asserted.

7.1.2 WHEN SAMPLED 

After the 82495XP asserts CADS#, it begins sam- 
pling BGT# until it is sampled active. 

BGT# is a "Don't Care" after it has been recognized
for cycle N and prior to the assertion of
CADS# for cycle N + 1. In addition, BGT# is a
"Don't Care" once a cycle started by CADS# is
aborted by a snoop, until the cycle is restored by the
reissuing of CADS#.

7.1.3 RELATION TO OTHER SIGNALS 

When implementing BGT# in the MBC the following 
rules should be used: 

1. BGT# must follow every assertion of CADS#, 
unless the cycle is aborted due to a snoop. 

2. It must precede CRDY# (for line fills and allocations,
BGT# must precede CRDY# by at least 3
CLKs).

3. In addition BGT# must be asserted with or be- 
fore the assertion of KWEND# and SWEND#. 

4. BGT# must be asserted with or before the asser- 
tion of BRDY# by the MBC. 

5. BGT# is not required following the assertion of 
SNPADS#. 

6. BGT# must be asserted with or before the assertion
of MEOC#.
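
The ordering rules above lend themselves to a simple trace check. The sketch below is an illustration only: it assumes each cycle is described by a dictionary mapping a signal name to the CLK number of its first assertion (a hypothetical representation), and it returns the numbers of the rules that a trace violates. Rule 5 concerns snoop cycles started by SNPADS# and is not checked.

```python
def check_bgt_rules(events, line_fill_or_allocation=False, aborted_by_snoop=False):
    """Return the rule numbers (from the list above) that a cycle violates.

    events maps signal names ("BGT#", "CRDY#", "KWEND#", "SWEND#",
    "BRDY#", "MEOC#") to the CLK in which each is first asserted.
    Signals that are absent from the dictionary are not checked.
    """
    bgt = events.get("BGT#")
    if bgt is None:
        # Rule 1: BGT# follows every CADS# unless the cycle was aborted.
        return [] if aborted_by_snoop else [1]

    violations = []

    # Rule 2: BGT# precedes CRDY#, by at least 3 CLKs for fills/allocations.
    crdy = events.get("CRDY#")
    if crdy is not None:
        min_gap = 3 if line_fill_or_allocation else 1
        if crdy - bgt < min_gap:
            violations.append(2)

    # Rule 3: BGT# with or before KWEND# and SWEND#.
    if any(n in events and bgt > events[n] for n in ("KWEND#", "SWEND#")):
        violations.append(3)

    # Rule 4: BGT# with or before the first BRDY# from the MBC.
    if "BRDY#" in events and bgt > events["BRDY#"]:
        violations.append(4)

    # Rule 6: BGT# with or before MEOC#.
    if "MEOC#" in events and bgt > events["MEOC#"]:
        violations.append(6)

    return violations
```

A checker of this kind could be run over simulation traces of an MBC design; an empty list means none of the checked rules was broken.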



7.2 BLE# 

BE Latch Enable 

Controls latching of i860 XP CPU's byte enable and 
cycle attribute signals 

Output of 82495XP (pin C16) Cycle Control Signal. 

Synchronous to CLK 

7.2.1 SIGNAL DESCRIPTION 

BLE# is used to control the enable line of an exter- 
nal latch (clock edge triggered '377 type). This latch 
is used to capture the i860 XP CPU's byte enables 
(BE0#-BE7#) and CPU cycle attribute signals 
which do not go through the 82495XP. The 82495XP 
manages the opening and closing of this latch: when 
BLE# is active, new values from the CPU enter the 
latch at each rising edge of CLK. 

The 82495XP latches the byte enables after ADS# 
of a memory bus bound cycle. It relatches this infor- 
mation with CRDY# or CNA# of that cycle if anoth- 
er cycle is pending. 
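
The behavior of the external latch can be sketched in a few lines. This is a behavioral illustration only, assuming the usual convention that the trailing # marks an active-low signal (so BLE# = 0 means the enable is active); the class and method names are hypothetical.

```python
class BELatch:
    """Behavioral sketch of a '377-type register gated by BLE#."""

    def __init__(self):
        self.q = None  # nothing has been captured yet

    def rising_clk(self, ble_n, cpu_signals):
        """Call once per rising edge of CLK.

        ble_n:       level of BLE# (0 = active, 1 = inactive).
        cpu_signals: dict of the CPU outputs routed to the latch,
                     e.g. byte enables and cycle attribute signals.
        """
        if ble_n == 0:
            # Enable active: new CPU values enter the latch on this edge.
            self.q = dict(cpu_signals)
        # Enable inactive: the previously captured values are held.
        return self.q
```

While BLE# stays inactive, the MBC continues to see the attributes of the cycle that was captured, even if the CPU has already moved on to drive the next one.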

7.2.2 WHEN DRIVEN 

The 82495XP latches the BE latch signals 1 clock
after ADS# of a memory-bound cycle. Thus latched
BE0#-BE7# are valid with CADS#. The 82495XP
opens, then closes this latch if a cycle is pending
and CNA# or CRDY# is asserted. Thus latched
BE0#-BE7# are valid two clocks after CNA# or
CRDY#, which is one clock after CADS# for
back-to-back cycles. The signals latched in the BE
latch are only valid for CPU-generated memory bus
cycles (i.e., not an 82495XP-generated write-back or
allocation).

7.2.3 RELATION TO OTHER SIGNALS 

The following CPU signals must be latched in the BE 
latch: 



BE0#-BE7#     CACHE#        CTYP

LEN           PCD           PCYC

PWT







All other signals in the 82495XP to CPU interface 
(listed in sec. 7.0.2) must be connected only be- 
tween the i860 XP CPU and the 82495XP. 



7.3 BRDY# 

Burst Ready 

Memory Bus Controller Burst Ready input to 
82495XP, 82490XP, and i860 XP CPU 

Input to 82495XP and 82490XP (82495XP pin P1, 
82490XP pin 60) Cycle Progress Signal 

Input to i860 XP CPU (BRDY2#, pin U1) 

Synchronous to CLK 

7.3.1 SIGNAL DESCRIPTION 

The BRDY# input to both the 82495XP and 
82490XP must be connected to the BRDY# signal 
which the MBC is providing to the i860 XP CPU's 
BRDY2# pin. The signal is used by the 82495XP for 
burst tracking purposes. In the 82490XP, it incre- 
ments the CPU latch burst counter. 

During CPU read cycles, BRDY# allows the next 32 
or 64-bit slice of read data to be available at the 
82490XP's CDATA outputs (CPU bus) by advancing 
the CPU latch burst counter. At the same time, 
BRDY# is latching the previous slice of data into the 
i860 XP CPU. Refer to chapter 6 for more details. 

During CPU write cycles, BRDY# is used to latch 
each slice of write data into the CPU latches and 
advance the latch counter. 

During CPU special and I/O cycles (which are not 
posted) BRDY# is used to end the cycle. 

BRDY# must not be asserted until the bus is granted
(BGT# asserted) and until the data path is ready
for transferring (CDTS# is asserted).

7.3.2 WHEN SAMPLED

BRDY# is sampled by the CPU, the 82495XP, and 
the 82490XP at every CLK edge. It must always 
meet proper setup and hold times to CLK. Even 
though the CPU latch may not be in use, BRDY# 
assertion will still advance the latch counter. 



7.3.3 RELATION TO OTHER SIGNALS 

BRDY# controls the CPU and 82490XP CPU latch- 
es. BRDY# has the following implication rules: 

1. The last BRDY# for cycle N must be asserted 2
clocks before MEOC# for cycle N + 1.

2. BRDY# >= BGT#

3. BRDY# > CDTS#



7.4 C490LDRV 

82490XP Low Drive Buffer 

Selects the 82495XP low capacitance driving buffers 
Input to 82495XP (pin M3) Configuration Signal 
Synchronous to CLK 

7.4.1 SIGNAL DESCRIPTION 

C490LDRV selects the driving strength of the
82495XP buffers that interface to the 82490XP.
Refer to the layout specifications for information on how
C490LDRV should be connected.

7.4.2 WHEN SAMPLED 

C490LDRV is a configuration input sampled as shown in
Figure 7-1. C490LDRV requires a setup time of 4 CPU
clocks. After sampling, C490LDRV is a "don't care"
until it is sampled as the BGT# pin after the first
CADS# assertion.



7.4.3 RELATION TO OTHER SIGNALS

C490LDRV shares a pin with BGT#. 

7.5 CADS# 

Cache Address Strobe 

Indicates beginning of cache cycle 

Output from 82495XP (pin E3) Cycle Control Signal 

Synchronous to CLK 

7.5.1 SIGNAL DESCRIPTION 

CADS# requests the execution of a memory bus
cycle by the MBC, and indicates that the cycle
attributes (i.e., CD/C#, CM/IO#, CW/R#, PALLC#,
etc.) are valid.

If the 82495XP receives a snoop hit to an [M] state
line before BGT# is asserted by the MBC, the
current CADS# is aborted and reissued after the snoop
has completed. If the current line (issued by the
stalled CADS#) is invalidated by the snoop, then
that CADS# is cancelled (i.e., will not be reissued
after the snoop is completed).

CADS# is a glitch-free signal. 

7.5.2 WHEN DRIVEN 

CADS# is asserted by the 82495XP for exactly one 
CLK, and is always a valid logic level. 

7.5.3 RELATION TO OTHER SIGNALS 

CADS#, when asserted, indicates that the cache
cycle control and attribute signals (e.g., CD/C#,
NENE#, CW/R#, etc.) are valid.

Since allocations do not require BRDY#s to the
CPU, the CDTS# of an allocation cycle will always
occur with CADS# of the allocation. In normal
cycles the 82495XP will generate CADS# followed by
CDTS#.

CADS# == CDTS# for all write-through cycles.

Once CADS# is active, PALLC#, CWAY, CDTS#,
and BUS# are valid. Address and cycle specification
signals (MSET0-MSET10, MTAG0-MTAG11,
MCFA0-MCFA6, CW/R#, CM/IO#, CD/C#,
RDYSRC, MCACHE#, NENE#, SMLN#, KLOCK#,
and CPLOCK#) will be valid with CADS# active as
well.

Every CADS# initiated cycle requires a BGT# and 
CRDY# input from the MBC. 

CADS# and SNPADS# will never be asserted on 
the same CLK. 



7.6 CAHOLD 

82495XP AHOLD Output 
Self-test result and AHOLD output status 
Output of 82495XP (pin G4) Test Signal 
Synchronous to CLK 

7.6.1 SIGNAL DESCRIPTION 

CAHOLD has two functions. One, it indicates the re- 
sult of the built-in self-tests of the 82495XP. Two, it 
represents the 82495XP AHOLD into the i860 XP 
CPU. 

The 82495XP drives CAHOLD after the 82495XP 
self-tests have completed. CAHOLD should be 
latched when FSIOUT# goes inactive after reset. If 
CAHOLD is high, the self-tests have passed, other- 
wise they have failed. 

When the 82495XP drives AHOLD to the i860 XP 
CPU, it also drives CAHOLD, thus providing a means 
of tracking inquire cycles and back invalidations for 
performance monitoring. 

7.6.2 WHEN DRIVEN 

CAHOLD is always at a valid logic level. During self- 
test, CAHOLD is held until the clock edge that FSI- 
OUT# is sampled inactive. After self-test, or reset, 
CAHOLD is asserted whenever the 82495XP as- 
serts AHOLD. 



7.6.3 RELATION TO OTHER SIGNALS 

CAHOLD reflects the value of AHOLD except during 
self-test. During self-test, the value of CAHOLD 
should be latched with the falling edge of FSIOUT# 
to determine pass/fail. 



7.7 CD/C# 

Cache Data/Code 

Indicates whether current cycle is Code or Data 
Output from 82495XP (pin D3) Cycle Control Signal 
Synchronous to CLK 

7.7.1 SIGNAL DESCRIPTION 

CD/C#, along with CW/R# and CM/IO#, is a 
82495XP cycle definition signal. It indicates the type 
of bus cycle being requested of the MBC. CD/C# 
can be pipelined by the memory bus controller (by 
using the CNA# input to the 82495XP). 

7.7.2 WHEN DRIVEN 

CD/C# is valid in the same CLK as CADS# and
remains valid until CRDY# or CNA#. CD/C# is
always a valid logic level.

7.7.3 RELATION TO OTHER SIGNALS 

Address and cycle specification signals (MSET0-
MSET10, MTAG0-MTAG11, MCFA0-MCFA6,
CW/R#, CM/IO#, CD/C#, RDYSRC, MCACHE#,
NENE#, SMLN#, KLOCK#, and CPLOCK#) will be
valid with CADS#.

7.8 CDATA0-CDATA7

CPU Data Bus Connection 

Data Bus Connection from 82490XP to CPU 

Input/Output to 82490XP (pins 48, 54, 49, 55, 46,
51, 52, 57)

Isolated Interface

7.8.1 SIGNAL DESCRIPTION

CDATA0-CDATA7 is the 82490XP data bus connection to
the CPU. All or part of these 8 pins will be used in
connecting the 82490XP to the CPU, depending on
the cache configuration. See layout information for
details.



7.9 CDTS# 

Cache Data Strobe 

Indicates availability of CPU data/data bus 
Output from 82495XP (pin F4) Cycle Control Signal 
Synchronous to CLK 

7.9.1 SIGNAL DESCRIPTION 

For read cycles, CDTS#, when asserted, indicates 
that in the next CPU clock the data bus path is avail- 
able. This is the earliest time in which BRDY# may 
be supplied to the CPU. For CPU initiated write cy- 
cles, it indicates that the data is available on the 
memory bus. For i860 XP CPU inquire cycles, 
CDTS# informs the MBC that the last piece of in- 
quire data is valid on the CPU bus. 

Usage of this signal allows complete independence 
between address strobes (CADS# and SNPADS#) 
and data strobe. CDTS# allows the 82495XP to sig- 
nal the MBC that a new cycle has begun as soon as 
addresses are available. This allows memory bus cy- 
cles to start before data is ready to be given/taken. 

CDTS# is a glitch-free signal. 



7.9.2 WHEN DRIVEN 

CDTS# is asserted for one CLK, at the same time or 
later than CADS# for any given cycle. 

7.9.3 RELATION TO OTHER SIGNALS 

When the MBC samples CDTS# asserted, it can 
begin providing BRDY#s for the read cycle to the 
CPU in the next CLK. CDTS# must always be as- 
serted before CRDY# and must be asserted prior to 
the first BRDY#. 

The CDTS# of an allocation will always occur with 
CADS# of the allocation. In normal cycles the 
82495XP will generate CDTS# following CADS#. 

CDTS# will be asserted at least one CLK after 
SNPADS#. 



7.10 CFG0-CFG2 

Configuration Pins 

Determine Cache Characteristics 

Input to 82495XP (pins L4, Q1, M4) Configuration
Signals

Synchronous to CLK 

7.10.1 SIGNAL DESCRIPTION 

CFG0-CFG2 are the 3 cache configuration inputs 
that determine cache characteristics such as line ra- 
tio, tag size, and lines per sector. During RESET, this 
information is passed on to the 82490XPs. The fol- 
lowing table maps CFG0-CFG2 to their respective 
configurations for the i860 XP CPU: 



Config   Line    Lines/   No. of
No.      Ratio   Sector   Tags     CFG2   CFG1   CFG0

1        1       1        8K       0      0      1

2        2       1        4K       1      1      1

3        1       2        8K       0      0      0

4        2       1        8K       0      1      1

5        4       1        4K       1      1      0

7.10.2 WHEN SAMPLED

CFG0-CFG2 are sampled as shown in Figure 7-1 with a
setup time of at least 10 CPU clocks. After sampling,
CFG0, CFG1, and CFG2 become cycle progress
input signals to the 82495XP and are sampled after
CADS# of the first cycle.
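
As an illustration of how the strapping might be decoded, for example in a board bring-up script, the sketch below maps the sampled (CFG2, CFG1, CFG0) levels to the configuration parameters of section 7.10.1. The names are hypothetical. Note the assumption: the bit patterns are taken from the configuration table, with cells that do not show a digit read as 0, so they should be checked against an authoritative copy of the table before use.

```python
# (CFG2, CFG1, CFG0) -> configuration parameters.
# Assumption: table cells without a visible digit are taken to be 0.
CFG_TABLE = {
    (0, 0, 1): {"config": 1, "line_ratio": 1, "lines_per_sector": 1, "tags": "8K"},
    (1, 1, 1): {"config": 2, "line_ratio": 2, "lines_per_sector": 1, "tags": "4K"},
    (0, 0, 0): {"config": 3, "line_ratio": 1, "lines_per_sector": 2, "tags": "8K"},
    (0, 1, 1): {"config": 4, "line_ratio": 2, "lines_per_sector": 1, "tags": "8K"},
    (1, 1, 0): {"config": 5, "line_ratio": 4, "lines_per_sector": 1, "tags": "4K"},
}


def decode_cfg(cfg2, cfg1, cfg0):
    """Return the cache configuration selected by the CFG pins at RESET."""
    try:
        return CFG_TABLE[(cfg2, cfg1, cfg0)]
    except KeyError:
        # Combinations not listed in the table are treated as unsupported.
        raise ValueError(f"unlisted CFG2-CFG0 combination: {cfg2}{cfg1}{cfg0}")
```

For instance, `decode_cfg(1, 1, 0)["line_ratio"]` gives the line ratio of configuration 5 under the assumptions above.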



7.10.3 RELATION TO OTHER SIGNALS

CFG0 shares a pin with CNA#, CFG1 shares a pin
with SWEND#, and CFG2 shares a pin with
KWEND#.



7.11 CLK

i860 XP CPU, 82495XP, 82490XP Clock
Input to the 82495XP (D11)

7.11.1 SIGNAL DESCRIPTION

The CLK input determines the execution rate and
timing of the 82495XP, 82490XP, and CPU. Pin
timings are specified relative to the rising edge of this
signal. The i860 XP CPU, 82495XP, and 82490XP
require TTL levels on CLK for proper operation.



7.12 CM/IO#

Cache Memory/IO

Indicates whether current cycle is Memory or IO
Output from 82495XP (D4) Cycle Control Signal
Synchronous to CLK

7.12.1 SIGNAL DESCRIPTION

CM/IO#, along with CW/R# and CD/C#, is a
82495XP cycle definition signal. It indicates the type
of bus cycle being requested of the MBC. CM/IO#
can be pipelined by the memory bus controller
(CNA# input to the 82495XP).

7.12.2 WHEN DRIVEN

CM/IO# is valid in the same CLK as CADS#, and
remains valid until CRDY# or CNA#.

7.12.3 RELATION TO OTHER SIGNALS

Address and cycle specification signals (MSET0-
MSET10, MTAG0-MTAG11, MCFA0-MCFA6,
CW/R#, CM/IO#, CD/C#, RDYSRC, MCACHE#,
NENE#, SMLN#, KLOCK#, and CPLOCK#) will be
valid with CADS# assertion.



7.13 CNA#[CFG0]

82495XP Next Address Enable

Dynamically pipelines CADS# cycles

Input to 82495XP (pin L4) Cycle Progress Signal

Synchronous to CLK

7.13.1 SIGNAL DESCRIPTION

CNA# is used by the MBC to dynamically pipeline
CADS# cycles. When active it indicates to the
82495XP that the next MBC request can be started.
Only one level of pipelining is allowed in the
82495XP.

CNA# is an optional input for all cycles initiated with
CADS#.

7.13.2 WHEN SAMPLED

CNA# is sampled starting in the first CLK in which
BGT# is sampled active until CRDY# is sampled
active. CNA# is then ignored until the BGT# of the
next cycle.

CNA# is ignored during snoop write-back cycles.

7.13.3 RELATION TO OTHER SIGNALS

Once the 82495XP samples this signal active, it
issues the CADS# for the next memory bus cycle as
soon as one begins.

CNA# is recognized between BGT# and CRDY#
or CDTS# and CRDY# of a given cycle.



7.14 CRDY# 

Cache Ready 

Ends a cycle in the 82495XP/82490XP 

Input to 82495XP and 82490XP (pins M2, 43) Cycle 
Progress Signal 

Synchronous to CLK 

7.14.1 SIGNAL DESCRIPTION 

CRDY# is used by the 82495XP and 82490XP to
end a memory bus cycle. CRDY# indicates full
completion of the cycle and allows the
82495XP/82490XP to free internal resources for the
next cycle. In the 82490XP, this means that the
current memory buffer in use is emptied (put in array,
discarded, etc.). In the 82495XP, CRDY# assertion
allows 82495XP cycle progress signals (BGT#,
KWEND#, SWEND#) to be sampled for the next
cycle if pipelining is used.

CRDY# is required for all 82495XP/82490XP memory
bus cycles, including snoop cycles. CRDY#
must be asserted to the 82495XP and 82490XP at
the same time.



7.14.2 WHEN SAMPLED 

CRDY# for a given cycle is ignored until KWEND# 
is returned for that cycle. If KWEND# is not required 
for the cycle, CRDY# is ignored until BGT#. When 
CRDY# is ignored, it may violate setup and hold 
times. 



7.14.3 RELATION TO OTHER SIGNALS

CRDY# must be sampled by the 82495XP and
82490XP at the same time. For the 82495XP,
CRDY# has many cycle implication rules:

1. CRDY# > CDTS#

2. CRDY# > BGT#

3. CRDY# > BGT# + 2 clocks if cycle is a line-fill
or allocation

4. CRDY# >= KWEND# if cycle is a line-fill or
write-through with potential allocation (PALLC# = 0)

For the 82490XP, CRDY# has three basic rules:

1. MEOC# for cycle N must be sampled with or
before CRDY# for cycle N.

2. MEOC# for cycle N + 1 must be sampled at least
2 CPU clocks after CRDY# for cycle N.

3. CRDY# for cycle N + 1 must be after the last
BRDY# for cycle N.

MBRDY# fills the current 82490XP memory buffer.
CRDY# empties this buffer and makes it available for
new cycles. CRDY# may be asserted on the same
clock as MEOC#, which may be asserted on the
same clock as MBRDY#.

CRDY# shares a pin with SLFTST#.



7.15 CWAY

Cache Way

Indicates WAY used by the current cycle

Output from 82495XP (pin J3) Cycle Control Signal

Synchronous to CLK

7.15.1 SIGNAL DESCRIPTION

CWAY is a cycle definition signal which indicates to
the MBC the WAY used by the requested cycle. On
line-fills it indicates the way the line will be loaded.
For write-hits (to [S] state or LOCKed) it indicates
the way which was a hit. For write-backs it indicates
the way that was written-back.

CWAY is utilized by external tracking machines in
order for the 82495XP tags to be accurately
duplicated.

7.15.2 WHEN DRIVEN 

CWAY is valid together with CADS# and remains 
valid until CRDY# or CNA#. 



7.15.3 RELATION TO OTHER SIGNALS 

CWAY is valid with CADS#.

7.16 CW/R# 

Cache Write/Read

Indicates whether current cycle is write or read 
Output from 82495XP (pin E4) Cycle Control Signal 
Synchronous to CLK 

7.16.1 SIGNAL DESCRIPTION 

CW/R#, along with CD/C# and CM/IO#, is a 
82495XP cycle definition signal. It indicates the type 
of bus cycle being requested of the MBC. CW/R# 
can be pipelined by the memory bus controller 
(CNA# input to the 82495XP). 

7.16.2 WHEN DRIVEN 

CW/R# is valid in the same CLK as CADS# and is 
valid until CRDY# or CNA#. 

7.16.3 RELATION TO OTHER SIGNALS 

Address and cycle specification signals (MSET0-
MSET10, MTAG0-MTAG11, MCFA0-MCFA6,
CW/R#, CM/IO#, CD/C#, RDYSRC, MCACHE#,
NENE#, SMLN#, KLOCK#, and CPLOCK#) will be
valid with CADS#.

7.17 DRCTM#

Memory Bus Direct to [M] State 

Signals 82495XP to tag data direct to the [M] state, 
skipping the [E] and [S] states. 

Input to the 82495XP (pin M1) Cycle Attribute Signal 

Synchronous to CLK 

7.17.1 SIGNAL DESCRIPTION 

DRCTM# is an input to the 82495XP from the mem- 
ory bus. When sampled active at the end of the 
snooping window (SWEND# activation), the 
82495XP moves the line fill in progress directly to 
the [M] state. 

There are three cases in which this is useful. 

1. Simplifies External State Tracker

External trackers can only track the [M], [S], and
[I] states. The [E] state cannot be tracked
externally since cache write hits internally change [E]
state lines to [M] state. DRCTM# can be used to
eliminate the [E] state from the MESI protocol.

2. Read For Ownership 

During a write miss with allocation the write may
go to the memory buffer and not be written to
memory. A read from memory, in conjunction with
the MFRZ# signal asserted, reads the data to fill
around the bytes written by the CPU. The
contents of the memory buffer are then entered into
the cache. The cache would normally tag this
data in the [E] state (the cache assumes the
write went to main memory). The system has the
option of never completing the write to memory
(this increases performance by completing the
allocation more quickly). If the write is not performed to
memory, the cache is the only owner of the new
data and therefore the cache entry must be
tagged to the [M] state.

3. Cache to Cache Transfer 

A cache to cache transfer may occur as a result
of a snoop. For example, CPU/Cache 1 performs
a read from main memory and CPU/Cache
2 flags it as a snoop hit to an [M] state line. To
expedite the transfer, the system may perform
the write-back from CPU/Cache 2 directly to
CPU/Cache 1, bypassing memory. CPU/Cache 1
assumes the write-back went to memory and
would normally tag the line to the [S] state. Since
the system did not perform the write to memory,
the system should drive DRCTM# to force the
line to the [M] state. In addition, the line should
be invalidated in CPU/Cache 2 by driving
SNPINV.



7.17.2 WHEN SAMPLED 

DRCTM# is synchronous to CLK. It is only sampled 
when SWEND# is active (the end of the snooping 
window). When SWEND# is inactive DRCTM# is 
ignored and does not have to meet setup and hold 
times. 



7.17.3 RELATION TO OTHER SIGNALS 

DRCTM# (direct to [M]) and MWB/WT# (write poli- 
cy) combine to define the memory bus attributes and 
are sampled on CLK at the end of the snooping win- 
dow (SWEND# activation). 

If MRO# is sampled active during KWEND#, 
DRCTM# is ignored. 
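
The effect of DRCTM# on the final tag state can be sketched as a small function. This is an illustration of the rules in this section only: the function and argument names are hypothetical, and the state the line would otherwise receive ([E] or [S]) is supplied by the caller rather than derived, since that depends on attributes such as MWB/WT# that are described elsewhere.

```python
def line_fill_state(default_state, drctm_active, mro_active=False):
    """Tag state for a line fill, as decided at SWEND# activation.

    default_state: "E" or "S", the state the fill would normally get.
    drctm_active:  DRCTM# sampled active when SWEND# is active.
    mro_active:    MRO# sampled active during KWEND#.
    """
    if default_state not in ("E", "S"):
        raise ValueError("default_state must be 'E' or 'S'")
    # DRCTM# is ignored when MRO# was sampled active during KWEND#.
    if drctm_active and not mro_active:
        return "M"  # skip [E] and [S], go directly to [M]
    return default_state
```

In the Read For Ownership case, `line_fill_state("E", True)` yields "M"; in the cache to cache transfer case, `line_fill_state("S", True)` also yields "M".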



7.18 FLUSH#

Flush

Causes an 82495XP Cache Flush

Input to 82495XP (N4) Cache Synchronization Signal

Asynchronous input 

7.18.1 SIGNAL DESCRIPTION 

This signal causes the 82495XP to flush all its
modified lines to main memory. The flushing of modified
lines requires the 82495XP to perform back-invalidation
and inquire cycles to the CPU. At the end of the
flush, the 82495XP tag array will be completely
invalidated.

FLUSH# will invalidate the entire 82495XP tag array.
It takes two clocks to look up and invalidate a
tag entry. The 82495XP will also invalidate tags in
the CPU cache by running back-invalidation cycles.
If the 82495XP tag state is modified, the 82495XP
will run inquire cycles to the i860 XP CPU to see if
the line is modified in its cache. If so, the i860 XP
CPU will write back the line into the 82495XP write
buffer. All modified 82495XP cache lines must be
written to memory.
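
The two-clocks-per-tag figure gives a lower bound on the duration of a flush. The sketch below works this out; it is only an estimate under stated assumptions: it ignores the time for write-backs, inquire cycles and back-invalidations, takes "8K" to mean 8 x 1024 tag entries, and the 50 MHz clock in the usage note is a hypothetical value chosen for the arithmetic.

```python
def min_flush_clocks(num_tags, clocks_per_tag=2):
    """Lower bound on CLKs needed to look up and invalidate every tag."""
    return num_tags * clocks_per_tag


def min_flush_seconds(num_tags, clk_hz, clocks_per_tag=2):
    """Same lower bound expressed in seconds for a given CLK frequency."""
    return min_flush_clocks(num_tags, clocks_per_tag) / clk_hz
```

With 8K tags the bound is 2 x 8192 = 16384 clocks; at an assumed 50 MHz CLK that is 16384 / 50,000,000 s, about 328 microseconds, before any write-back traffic is counted.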

7.18.2 WHEN SAMPLED 

FLUSH# can be asserted at any time. The 82495XP
will complete all outstanding transactions on the
CPU and memory bus before beginning the
FLUSH# process. The memory bus controller does
not have to prevent FLUSH# during locked cycles
because the 82495XP will complete its locked
transaction before the FLUSH# process begins.

Once a FLUSH# operation has begun, the FLUSH#
signal is ignored until the operation completes. If
RESET is activated while the FLUSH# operation is
in progress, the FLUSH# operation will be aborted
and the RESET immediately executed.

FLUSH# is an asynchronous input. FLUSH# must
have a pulse width of at least 2 CLKs in order to
guarantee 82495XP recognition.

7.18.3 RELATION TO OTHER SIGNALS 

To initiate a FLUSH#, the 82495XP will complete all
pending cycles and prohibit the processor from issuing
any further ADS#'s while the FLUSH# is in
progress. The FSIOUT# output signal is used to
indicate the start and end of the FLUSH# operation. It
will become active when the FLUSH# signal is internally
recognized (all outstanding cycles have completed)
and will de-activate with the CRDY# of the
last FLUSH# write-back.

The memory bus controller supplies BRDY# to the
CPU once FSIOUT# has gone inactive and the
FLUSH# is complete. Once FLUSH# has begun and
FSIOUT# is active, all CADS#'s and CRDY#'s
correspond to write-backs caused by the FLUSH#
operation.

The 82495XP can be snooped during FLUSH# cycles,
and the snooping protocol is the same as
that for any other memory bus cycle.



7.19 FPFLD# [FPFLDEN]

External FIFO PFLD

Indicates PFLD cycle during external PFLD FIFO
mode

Output of the 82495XP (J4) Cycle Control Signal

Sync to CLK

7.19.1 SIGNAL DESCRIPTION

During RESET, this pin functions as the FPFLDEN
configuration signal. The 82495XP can be configured
to decode the i860 XP microprocessor's PFLD
cycles. The 82495XP supports 3 operational modes
for PFLD cycle decoding, as defined by FPFLDEN
and NCPFLD#:

Mode #1. PFLD cycles are cached in the 82495XP.

Mode #2. PFLD cycles are not cached in the
         82495XP, without an external PFLD
         extension FIFO.

Mode #3. PFLD cycles are not cached in the 82495XP,
         with an external PFLD extension FIFO.

Mode            FPFLDEN    NCPFLD#
1               0          1
2               0          0
3               1          1
Illegal Mode    1          0

If mode 3 has been selected, the 82495XP allows
the PFLD pipeline to be extended with an external
FIFO. After RESET, when this mode has been selected,
the FPFLD# output will indicate that the
requested cycle is a PFLD cycle. See Section 5.2.5 for
more details.

7.19.2 WHEN DRIVEN

FPFLDEN is sampled on RESET as in Figure 7-1,
with a setup time of 4 CPU clocks. In PFLD mode
#3, the FPFLD# output is valid in the same CLK as
CADS# and remains valid until CRDY# or CNA#.

7.19.3 RELATION TO OTHER SIGNALS

Address and cycle specification signals (MSET0-
MSET10, MTAG0-MTAG11, MCFA0-MCFA6,
CW/R#, CM/IO#, CD/C#, RDYSRC, MCACHE#,
NENE#, SMLN#, KLOCK#, and CPLOCK#) will be
valid with CADS#.



7.20 FSIOUT#

Flush, Sync, Initialization Output

Indicates the start and end of the Flush,
Sync, and Initialization operations.

Output of the 82495XP (D1) Cache Synchronization
Signal

Sync to CLK



7.20.1 SIGNAL DESCRIPTION 

This signal indicates the start and the end of either a 
Flush, Sync, or Initialization (including self-test if re- 
quested) operation. These operations are mutually 
exclusive. This signal is activated when the 82495XP 
begins the operation and goes inactive upon com- 
pletion of the operation. 



7.20.2 WHEN DRIVEN 

This signal will be asserted whenever a Flush, Sync,
or Initialization operation is internally recognized by
the 82495XP and is in progress.

7.20.3 RELATION TO OTHER SIGNALS

FSIOUT# active indicates that either Flush, Sync, or 
Initialization operation is in progress. Only one of 
these operations can be run within the 82495XP at a 
time. 

The table below shows the priorities of these three 
operations: 



Operation         Trigger    Priority
Initialization    RESET      Highest
Flush             FLUSH#
Sync              SYNC#      Lowest



If a trigger of higher priority occurs while a lower 
priority operation is running, the lower priority opera- 
tion is aborted and the higher priority one executed. 
If a trigger of lower priority occurs when a higher 
priority one is running, the lower priority trigger is 
ignored. Once a FLUSH# or SYNC# operation has
begun, its trigger is ignored until the operation com- 
pletes. 

When a higher priority operation aborts a lower prior- 
ity one, FSIOUT# remains active. 

Since RESET, FLUSH#, and SYNC# are all asynchronous,
FSIOUT# is activated when the 82495XP
actually begins executing the operation internally.
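
The priority rules in this section can be sketched as a small behavioral model. This is only an illustration under stated assumptions: the class and method names are hypothetical, and treating a repeated RESET during initialization as "ignored" is an assumption of the sketch, since the text only states that rule for FLUSH# and SYNC#.

```python
# Priorities from the table in this section: larger number = higher priority.
PRIORITY = {"SYNC#": 1, "FLUSH#": 2, "RESET": 3}


class FsiModel:
    """Toy model of Flush/Sync/Initialization arbitration (hypothetical names)."""

    def __init__(self):
        self.running = None  # trigger of the operation in progress, or None

    @property
    def fsiout_active(self) -> bool:
        # FSIOUT# is active while any of the three operations is in progress.
        return self.running is not None

    def trigger(self, name: str) -> str:
        """Apply a trigger and report what happened to it."""
        if self.running is None:
            self.running = name
            return "started"
        if PRIORITY[name] > PRIORITY[self.running]:
            self.running = name  # the lower-priority operation is aborted
            return "preempted"
        # Equal or lower priority than the running operation
        # (equal-priority RESET is an assumption of this sketch).
        return "ignored"

    def complete(self) -> None:
        self.running = None


m = FsiModel()
print(m.trigger("SYNC#"))   # started
print(m.trigger("FLUSH#"))  # preempted
print(m.trigger("SYNC#"))   # ignored
print(m.fsiout_active)      # True
```

Note that FSIOUT# stays active across the preemption, matching the statement that it remains active when a higher priority operation aborts a lower priority one.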



7.21 HIGHZ# 

High Impedance Outputs 
Causes 82495XP outputs to be tristated 
Input to 82495XP (pin P4) Test Signal 
Synchronous to CLK 

7.21.1 SIGNAL DESCRIPTION 

The 82495XP will enter self-test if both SLFTST# is 
active and HIGHZ# is inactive during reset. If 
SLFTST# is sampled active and HIGHZ# is sam- 
pled active during reset, the 82495XP floats all its 
outputs until the 82495XP is reset again. Activation 
of HIGHZ# without SLFTST# does nothing. 

7.21.2 WHEN SAMPLED 

HIGHZ# is sampled as in Figure 7-1, with a setup time
of 10 CPU clocks. HIGHZ# is then a don't care until
the 82495XP reset sequence is complete (with
FSIOUT# going inactive), at which point the pin
becomes the MBALE pin.



7.21.3 RELATION TO OTHER SIGNALS 

HIGHZ# shares a pin with MBALE. 82495XP out- 
puts are tristated if both HIGHZ# and SLFTST# are 
sampled active during reset. 

7.22 KLOCK# 

82495XP LOCK# 

Request to MBC of LOCKed cycle 

Output from 82495XP (pin C3) Cycle Control Signal 

Synchronous to CLK 

7.22.1 SIGNAL DESCRIPTION 

KLOCK# indicates to the MBC that there is a re- 
quest to execute a locked cycle. This signal follows 
the CPU lock request. 

KLOCK# is simply a one-clock flow-through version 
of the CPU LOCK# signal. The 82495XP will acti- 
vate KLOCK# with CADS# of the first cycle of a 
LOCKed operation and it will remain active until the 
CADS# of the last cycle of the LOCKed operation. 

Note that if the memory bus is pipelined, there may 
be a situation in which KLOCK# deactivation is in 
the same CLK as its new activation (together with 
CADS#). In this case KLOCK# won't go inactive 
between back-to-back locked sequences. KLOCK# 
will never go inactive if the CPU LOCK# does not go 
inactive. The 82495XP will not open arbitration win- 
dows between back-to-back locked sequences; it is 
the memory bus controller's responsibility to imple- 
ment this functionality by detecting a LOCKed write 
followed by a LOCKed read. 

KLOCK# activation is not qualified by the tag array 
look-up (hit/miss indications); therefore, KLOCK# 
can be active before CADS# is asserted. 

7.22.2 WHEN DRIVEN 

KLOCK# assertion is a flow-through of 1 CLK from 
the CPU LOCK# after the 82495XP completes all 
pending cycles. KLOCK# deassertion is a flow- 
through of 1 CLK from the CPU LOCK# signal, and 
must be at least 1 CLK after the last CADS# of a
LOCKed sequence. KLOCK# is always driven to a 
valid logic level. 

7.22.3 RELATION TO OTHER SIGNALS 

Address and cycle specification signals (MSET0-
MSET10, MTAG0-MTAG11, MCFA0-MCFA6,
CW/R#, CM/IO#, CD/C#, RDYSRC, MCACHE#,
NENE#, SMLN#, KLOCK#, and CPLOCK#) will be
valid with CADS#.

7.23 KWEND#

Cacheability Window End 

Closes 82495XP Cacheability Window 

Input to 82495XP (pin M4) Cycle Progress Signal 

Synchronous to CLK 



7.23.1 SIGNAL DESCRIPTION

KWEND# is a cycle progress input to the 82495XP
that, when active, closes the cacheability window
and causes the cacheability attributes MKEN# and
MRO# to be sampled.

KWEND# is sampled by the 82495XP after BGT#
has been sampled active. KWEND# should be asserted
by the MBC once the memory address has
been decoded and cacheability (MKEN#) and read-only
(MRO#) attributes have been determined.

The sampling of KWEND# active allows SWEND#
to be sampled. Resolving KWEND# quickly allows
the non-cacheable window between BGT# and
SWEND# to be closed more quickly. KWEND#
activation also allows the 82495XP to start allocations
and begin replacements.

7.23.2 WHEN SAMPLED

KWEND# is sampled by the 82495XP on or after the
clock in which BGT# has been sampled active. Once
KWEND# is sampled active it is not sampled again
until BGT# of the next cycle. KWEND# need not
follow setup and hold times if it is not being sampled.

BGT#, KWEND# and SWEND# may be asserted
on the same clock edge.

KWEND# need only be activated for those cycles
which require the sampling of MKEN# and MRO#.
These are line-fills and write cycles with potential
allocation.

7.23.3 RELATION TO OTHER SIGNALS

KWEND# is sampled on or after BGT# and allows
the sampling of SWEND#. KWEND# activation
causes the sampling of MKEN# and MRO#.

According to cycle progress implication rules,
CRDY# must be at least one clock after KWEND#
for line fills and write-through cycles with potential
allocate.

KWEND# shares a pin with CFG2.



7.24 MALE

Memory Address Latch Enable
Tristates/Enables Memory Address Outputs
Input to 82495XP (pin Q2) Cycle Control Signal
Asynchronous

7.24.1 SIGNAL DESCRIPTION

The 82495XP contains an address latch which con- 
trols the last stage of the 82495XP address output. It 
is controlled by four signals: MAOE#, MBAOE#, 
MALE, and MBALE. The signals MALE and MBALE 
control the latching of the entire 82495XP address 
where MBALE controls the subline portion and 
MALE controls the rest. 

MALE is provided so that the memory bus controller 
can control when the next pipelined address is driv- 
en. With MALE high, the 82495XP address latch is in 
'flow-through' mode and the 82495XP address is 
available at the memory bus. Changes in the 
82495XP address are seen immediately at the mem- 
ory bus. When MALE is driven low the address at 
the latch input is latched. Any subsequent address 
driven by the 82495XP will not be seen at the memo- 
ry bus outputs until MALE is driven high again. 

MALE will latch 82495XP addresses regardless of 
the state of MAOE#. If MAOE# is inactive, MALE 
will still operate the latch properly, but the memory 
bus will be tristated. 
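
The interaction between MALE and MAOE# described above can be sketched as a step-wise behavioral model. This is a minimal illustration, not a timing-accurate description: the class name and method are hypothetical, and the model simply holds the last address seen while MALE was high.

```python
class AddressLatch:
    """Toy model of a transparent latch with an active-low output enable."""

    def __init__(self):
        self.latched = 0

    def output(self, addr_in: int, male: int, maoe_n: int):
        """Return the value seen on the memory bus, or None if tristated.

        male:   1 = flow-through, 0 = hold the previously latched address.
        maoe_n: 0 = outputs enabled (MAOE# active), 1 = tristated.
        """
        if male:
            self.latched = addr_in  # flow-through: latch follows its input
        # With MALE low, self.latched keeps the address captured earlier.
        # The latch operates regardless of MAOE#; MAOE# only gates the pins.
        return self.latched if maoe_n == 0 else None


latch = AddressLatch()
print(latch.output(0x100, male=1, maoe_n=0))  # 256
print(latch.output(0x200, male=0, maoe_n=0))  # 256 (held)
print(latch.output(0x300, male=1, maoe_n=1))  # None (tristated, still latching)
print(latch.output(0x400, male=0, maoe_n=0))  # 768
```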



7.24.2 WHEN SAMPLED 

MALE is asynchronous and can be asserted and 
deasserted at any time. MALE should always be 
driven to a valid state since it directly controls the 
operation of the address latch. 

7.24.3 RELATION TO OTHER SIGNALS 

MALE together with MBALE control the latching of 
the entire 82495XP output address. The other latch 
control signals, MAOE# and MBAOE#, provide the 
memory bus controller complete command over the 
address outputs. MAOE# and MBAOE# do not af- 
fect the operation of MALE or MBALE. 

MALE shares a pin with the WWOR# configuration
pin.

7.25 MAOE#

Memory Address Output Enable 
Tristates/ Enables Memory Address Outputs 
Input to 82495XP (pin S4) Cycle Control Signal 
Asynchronous except during snoop cycles 

7.25.1 SIGNAL DESCRIPTION 

The 82495XP has an address latch which is controlled
by a latch input, MALE, and an output enable
input, MAOE#. MAOE# has two main functions.
One, driving MAOE# active will enable the 82495XP
to drive its address lines MTAG0-11, MSET0-10,
and MCFA0-6. Two, MAOE# is a qualifier for snoop
cycles and must be inactive for the 82495XP to
snoop.

In general, MAOE# should be active if its 82495XP 
is the current bus master. When that 82495XP gives 
up the bus, MAOE# should be inactive to float the 
address lines and allow another master to snoop. 

MAOE# controls the output of the 82495XP ad- 
dress except the subline (burst) portion. This portion 
has a separate output control: MBAOE#. 

7.25.2 WHEN SAMPLED 

MAOE# is an asynchronous input (except during 
snoop cycles) and always has full control over the 
address output. For this reason, MAOE# must al- 
ways be driven to a valid state. 

The 82495XP does, however, sample MAOE# dur- 
ing snoop cycles. When sampled, MAOE# must 
meet proper setup and hold times. In synchronous 
snoop mode MAOE# is sampled on a CLK edge. In 
clocked mode MAOE# is sampled on a SNPCLK 
edge. In strobed mode MAOE# is sampled with the 
falling edge of SNPSTB#. If MAOE# is sampled active,
the snoop will be ignored. This allows
SNPSTB# to share a common line among multiple
82495XPs.

MAOE# need not meet any setup or hold time if it is 
not being sampled during a snoop cycle. 
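
As a small illustration of MAOE# acting as a snoop qualifier on a shared SNPSTB# line, the sketch below lists which devices would honor a snoop. The function names and device labels are hypothetical; only the sampling references and the "sampled active means ignored" rule come from the text above.

```python
# Sampling reference for MAOE# in each snoop mode, as listed in the text above.
SNOOP_SAMPLE_EDGE = {
    "synchronous": "CLK edge",
    "clocked": "SNPCLK edge",
    "strobed": "SNPSTB# falling edge",
}


def snoop_accepted(maoe_n: int) -> bool:
    """A snoop is honored only if MAOE# is sampled inactive (high)."""
    return maoe_n == 1


def snooping_devices(maoe_levels: dict) -> list:
    """Given per-device MAOE# levels at a common SNPSTB#, list devices that snoop."""
    return [name for name, level in maoe_levels.items() if snoop_accepted(level)]


# cache_a is the current bus master (MAOE# active), so it ignores the snoop.
print(snooping_devices({"cache_a": 0, "cache_b": 1, "cache_c": 1}))
```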

7.25.3 RELATION TO OTHER SIGNALS 

MAOE# together with MBAOE# control the entire 
82495XP address. Both signals are asynchronous 
and thus need never be synchronized to any clock. 
Both signals are, however, sampled during snoop 
cycles and require proper setup and hold times in 
these situations. 



MALE and MAOE# together provide full control 
over the 82495XP address output latch. 



7.26 MBALE 

Memory Burst Address Latch Enable 
Tristates/Enables Memory Burst Address Outputs 
Input to 82495XP (pin P4) Cycle Control Signal 
Asynchronous 

7.26.1 SIGNAL DESCRIPTION 

The 82495XP address latch is controlled by four sig- 
nals: MAOE#, MBAOE#, MALE, and MBALE. The 
signals MALE and MBALE control the latching of the 
entire 82495XP address where MBALE controls the 
subline portion and MALE controls the rest. 

MALE and MBALE are provided so that the memory 
bus controller has complete flexibility when the next 
address is driven. With MBALE high, the subline por- 
tion of the 82495XP address latch is in "flow- 
through" mode and the 82495XP subline address is 
available at the memory bus. Changes in the 
82495XP subline address are seen immediately at 
the memory bus. When MBALE is driven low the 
subline address at the latch input is latched. Any 
subsequent subline address driven by the 82495XP 
will not be seen at the memory bus outputs until 
MBALE is driven high again. 

MBALE will latch 82495XP addresses regardless of 
the state of MAOE# or MBAOE#. If MBAOE# is 
inactive, MBALE will still operate the latch properly, 
but the subline portion of the memory bus will be 
tristated. 

Separate line and subline address latch controls are 
provided so that the latch outputs may be driven at 
different times. The table below indicates the subline 
address bits for each line size. 




Line Size (Bytes)    Subline Address
32                   A3, A4
64                   A4, A5
128                  A5, A6
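
To make the subline address table concrete, the following sketch extracts the two subline bits from a byte address for each line size. The function and constant names are hypothetical; the bit positions are those in the table.

```python
# Lowest subline address bit for each line size, from the table above.
SUBLINE_LSB = {32: 3, 64: 4, 128: 5}


def subline_address(addr: int, line_size: int) -> int:
    """Return the 2-bit subline (burst) address selected by the table."""
    lsb = SUBLINE_LSB[line_size]
    return (addr >> lsb) & 0b11


print(subline_address(0x18, 32))   # bits A4..A3 of 0x18 -> 3
print(subline_address(0x18, 64))   # bits A5..A4 of 0x18 -> 1
print(subline_address(0x18, 128))  # bits A6..A5 of 0x18 -> 0
```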



7.26.2 WHEN SAMPLED 

MBALE is asynchronous and can be asserted and 
deasserted at any time. MBALE should always be 
driven to a valid state since it directly controls the 
operation of the address latch.

7.26.3 RELATION TO OTHER SIGNALS

MALE together with MBALE control the latching of 
the entire 82495XP output address. The other latch 
control signals, MAOE# and MBAOE#, provide the 
memory bus controller complete command over the 
address outputs. MAOE# and MBAOE# do not af- 
fect the operation of MALE or MBALE. 

MBALE shares a pin with the HIGHZ# configuration 
pin. 



7.27 MBAOE#

Memory Burst Address Output Enable
Tristates/Enables Memory Subline Address Outputs
Input to 82495XP (pin P6) Cycle Control Signal
Asynchronous except during snoop cycles

7.27.1 SIGNAL DESCRIPTION

The 82495XP address latch is controlled by four signals:
MAOE#, MBAOE#, MALE, and MBALE.
MAOE# and MBAOE# are the output enables of
this latch for the entire 82495XP address. Specifically,
MBAOE# controls the subline address portion
and MAOE# controls the rest.

MBAOE# has two functions. One, it can tristate the
subline portion of the address separately from the
rest of the address. Since the 82495XP does not
sequence through burst addresses, the memory system
may wish to provide the burst count. This requires
that the 82495XP address burst portion be
tristated after the first transfer. The Subline Address
table appears in Section 7.26, MBALE.

Two, MBAOE# is sampled during snoop cycles. If
MBAOE# is sampled inactive, the snoop write back
cycle, if any, will begin at the subline address provided.
If MBAOE# is sampled active, the snoop write
back will begin at subline address 0. This allows
snoop write backs to begin at the snooped subline
address and progress through the normal burst order.

7.27.2 WHEN SAMPLED

Like MAOE#, MBAOE# is asynchronous except
during snoop cycles and can be asserted or deasserted
at any time. Since MBAOE# has direct control
over the address latch, it must always be driven
to a valid state.

MBAOE# is, however, sampled during snoop cycles.
In synchronous snooping mode, MBAOE#
must meet proper setup and hold times to CLK's
rising edge. In clocked mode, MBAOE# must meet
setup and hold times to SNPCLK's rising edge. In
strobed mode, MBAOE# must meet setup and hold
times to SNPSTB#'s falling edge.

If MBAOE# is not being sampled for a snoop, i.e.,
SNPSTB# is not asserted, MBAOE# need not meet
any setup or hold time.

7.27.3 RELATION TO OTHER SIGNALS

MAOE# and MBAOE# control the entire 82495XP
address output asynchronously. This address latch
is completely controlled by MALE, MBALE, MAOE#,
and MBAOE#.

MBAOE# is only sampled by the 82495XP during
snoop cycles with SNPSTB#.



7.28 MBRDY# 

Memory Burst Ready 

Burst Ready input to 82490XP memory buffers 
Input to 82490XP (pin 22) Cycle Progress Signal 
Synchronous to MCLK 

7.28.1 SIGNAL DESCRIPTION 

When in clocked memory bus mode, MBRDY# (with 
MSEL# active) is used to advance the memory 
burst counter for the 82490XP buffer in use. This 
causes either new data to be latched from the mem- 
ory bus (read cycle), or new data to be driven from 
the 82490XP buffer (write cycle). MBRDY# is sam- 
pled on all MCLK edges in which MSEL# is sampled 
active and has no relation to CLK. In strobed mode, 
MBRDY# must be tied high as MISTB/MOSTB 
strobes data in/out of the 82490XP. 

For write cycles, the first piece of write data is avail- 
able at the MDATA pins. MBRDY# assertion with 
MSEL# active causes the next 32, 64, or 128-bit 
slice of write data to be available. If only one slice is 
required, MSEL# and MBRDY# need never go ac- 
tive. 

For read cycles, the first piece of read data flows 
through to the CPU. MBRDY# assertion with 
MSEL# active causes the next slice of memory data 
to be latched in the 82490XP buffer. BRDY# asser- 
tion will allow this data to be available on the CPU 
bus and latch it into the CPU. For cacheable cycles, 
MBRDY# needs to be asserted 4 or 8 times de- 
pending on the cache configuration. 
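
The qualification of MBRDY# by MSEL# in clocked mode can be sketched as a simple counter over MCLK edges. This is a behavioral illustration only; the function name and the sample sequence are hypothetical, and signal levels are given as active-low values (0 = active).

```python
def run_burst(samples, transfers):
    """Advance a burst counter over MCLK rising edges.

    samples:   iterable of (msel_n, mbrdy_n) levels, one per MCLK edge;
               both are active low (0 = active).
    transfers: number of MBRDY# assertions the cycle needs (4 or 8 for
               cacheable cycles, depending on the cache configuration).
    Returns the MCLK edge index at which the last transfer occurs, or None
    if the sequence ends before enough transfers have been counted.
    """
    count = 0
    for edge, (msel_n, mbrdy_n) in enumerate(samples):
        # MBRDY# only counts on edges where MSEL# is sampled active.
        if msel_n == 0 and mbrdy_n == 0:
            count += 1
            if count == transfers:
                return edge
    return None


# MSEL# inactive on edge 0, one wait state (MBRDY# inactive) on edge 3.
edges = [(1, 0), (0, 0), (0, 0), (0, 1), (0, 0), (0, 0)]
print(run_burst(edges, 4))  # 5
```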

7.28.2 WHEN SAMPLED

MBRDY# is sampled on all MCLK edges where
MSEL# is sampled active. In this way MSEL# qualifies
the MBRDY# input. If MSEL# is sampled inactive,
MBRDY# need not follow setup and hold times
to MCLK.

7.28.3 RELATION TO OTHER SIGNALS

MBRDY# is qualified by the MSEL# input.
MBRDY# advances the memory burst counter for
the 82490XP in use, which either inputs or outputs
data through MDATA.

MEOC# switches the 82490XP buffers to the next
pending cycle, so the last MBRDY# must come before
or on the clock of MEOC# assertion.



7.29 MCACHE#

82495XP Internal Cacheability

Indicates cycle cacheability attribute

Output from 82495XP (pin C2) Cycle Control Signal

Synchronous to CLK

7.29.1 SIGNAL DESCRIPTION

MCACHE# is driven by the 82495XP and indicates
that the current cycle may be cached. Data cacheability
is determined later in the cycle by MKEN#
assertion. MCACHE# is asserted for allocation,
replacement write-back cycles, and during cacheable
read-miss cycles (i.e., read-miss cycles in which PCD
is not asserted). It is not asserted for I/O, special, or
locked cycles.

Cycle Type       MCACHE#
Posted Writes    1
Write Backs      0
Read, PCD = 0    0
Read, PCD = 1    1
Allocation       0
I/O Cycles       1
Locked Cycles    1

7.29.2 WHEN DRIVEN

MCACHE# is valid in the same CLK as CADS# and
remains valid until CRDY# or CNA#.

7.29.3 RELATION TO OTHER SIGNALS

Address and cycle specification signals (MSET0-
MSET10, MTAG0-MTAG11, MCFA0-MCFA6,
CW/R#, CM/IO#, CD/C#, RDYSRC, MCACHE#,
NENE#, SMLN#, KLOCK#, and CPLOCK#) will be
valid with CADS#.
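
The MCACHE# levels described in Section 7.29.1 can be summarized as a lookup. The sketch below is only an illustration: the function name and the cycle-type labels are hypothetical, and the levels follow the rule that MCACHE# is asserted (low) for allocations, write-backs, and read misses with PCD not asserted.

```python
# MCACHE# is active low: 0 = asserted (cycle may be cached), 1 = not asserted.
def mcache_n(cycle: str, pcd: int = 0) -> int:
    """Return the MCACHE# level for a cycle type, per the rules in 7.29.1."""
    if cycle in ("write_back", "allocation"):
        return 0
    if cycle == "read":
        return 0 if pcd == 0 else 1  # cacheable read miss only when PCD = 0
    if cycle in ("posted_write", "io", "locked", "special"):
        return 1
    raise ValueError(f"unknown cycle type: {cycle}")


print(mcache_n("read", pcd=0))  # 0
print(mcache_n("read", pcd=1))  # 1
print(mcache_n("locked"))       # 1
```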



7.30 MCFA0-MCFA6 
MSET0-MSET10 
MTAG0-MTAG11 

MCFA0-MCFA6 Memory Configuration Address I/O 

MSET0-MSET10 Memory Set Address I/O 

MTAG0-MTAG11 Memory Tag Address I/O 

82495XP Memory Address Inputs/Outputs 

Input/Output of 82495XP (pins N14, P7-P15, Q6-
Q16, R4, R14-R17, S14-S17) Cycle Control Signals

Input Synchronous to CLK, SNPCLK, or SNPSTB#. 

Output from CLK, MAOE# active or MALE high. 

7.30.1 SIGNAL DESCRIPTION 

MSET0-10, MTAG0-11, and MCFA0-6 provide the
complete 30-bit address input/output interface of
the 82495XP to the memory bus. Together they
span the entire CPU address range A2-A31. Depending
on the cache configuration, each pin represents
a different CPU address line (see configuration
section for details).

MSET0-10, MTAG0-11, and MCFA0-6 pass
through an 82495XP output latch. Latching is
controlled by MALE/MBALE, and the output
of this latch is enabled by MAOE#/MBAOE#.

With MAOE#/MBAOE# active, MSET/MTAG/
MCFA are 82495XP outputs. They are valid at the
start of a memory bus cycle at the input of the
82495XP address latch. If MALE/MBALE is high
(flow-through) and MAOE#/MBAOE# is active
(outputs enabled), they are driven to the memory
bus with CADS#.

If a new cycle starts and MALE/MBALE is low, the 
previous address remains valid at the 82495XP 
MSET/MTAG/MCFA outputs. Once MALE/MBALE 
goes high, the new address flows through with the 
appropriate propagation delay (MSET/MTAG/ 
MCFA address valid delay from MALE/MBALE go- 
ing high). The new address will be driven to the 
82495XP MSET/MTAG/MCFA outputs if MAOE#/ 
MBAOE# is active.

If a new cycle starts, MALE/MBALE is high, and
MAOE#/MBAOE# is inactive, the 82495XP MSET/
MTAG/MCFA outputs will remain tristated. Once
MAOE#/MBAOE# is asserted, the new address
flows through with the appropriate propagation delay
(MSET/MTAG/MCFA address valid from MAOE#/
MBAOE# going active).

MSET0-10, MTAG0-11, and MCFA0-6 are used
as inputs to the 82495XP during snoop cycles. Here,
MAOE#/MBAOE# is inactive. MSET/MTAG/
MCFA are sampled by the 82495XP during snoop
initiation just like the other snoop attributes.

7.30.2 WHEN SAMPLED 

If MALE/MBALE is high and MAOE#/MBAOE# is
low, MSET0-10, MTAG0-11, and MCFA0-6 are
valid with CADS# with a timing reference to CLK.
Otherwise, they are asserted with a delay from
MALE/MBALE high or MAOE#/MBAOE# active.

MSET0-10, MTAG0-11, and MCFA0-6 change
once CNA# or CRDY# is sampled active. MSET0-10,
MTAG0-11, and MCFA0-6 have a float delay
from MAOE#/MBAOE# going inactive. These outputs
are undefined after CRDY#/CNA# assertion
and before the next CADS# assertion.

As inputs during snoop cycles (SNPSTB# asserted), 
they must be sampled like other snoop attributes 
with proper setup and hold times. In synchronous 
snoop mode this is with respect to CLK; in clocked 
mode, this is with respect to SNPCLK; and in 
strobed mode this is with respect to SNPSTB# fall- 
ing edge. 

If MAOE# is inactive and SNPSTB# is not asserted
(no snoop), MSET0-10, MTAG0-11, and MCFA0-6
need not meet any setup or hold time.

7.30.3 RELATION TO OTHER SIGNALS 

MSET0-10, MTAG0-11, and MCFA0-6 are asserted
with CADS# so they are valid when CADS# is
sampled active. This is true as long as MALE/MBALE
is high and MAOE#/MBAOE# is active. If
MSET0-10, MTAG0-11, and MCFA0-6 have been
asserted but are blocked by MALE/MBALE or
MAOE#/MBAOE#, they are asserted from MALE/
MBALE going high or MAOE#/MBAOE# going active.

MSET0-10, MTAG0-11, and MCFA0-6 are deasserted
or changed with CADS# or CNA# active.
They may also be floated with MAOE# going inactive.



MSET0-10, MTAG0-11, and MCFA0-6 are used
as inputs during snoop cycles. They are sampled
with SNPSTB# like any other snoop attribute signal.



7.31 MCLK 

Memory Bus Clock 

Input to the 82490XP (Pin 26) 

7.31.1 SIGNAL DESCRIPTION 

In a clocked memory bus mode, this pin provides the 
memory bus clock. Memory bus signals and memory 
bus data are sampled on the rising edge of MCLK. 
Memory bus write data is driven off MCLK or 
MOCLK depending upon the configuration. MCLK 
has no relation to CLK. 

7.31.3 RELATION TO OTHER SIGNALS 

MCLK shares a pin with MISTB. 

In clocked memory bus mode, the MDATA7-
MDATA0, MSEL#, MFRZ#, MBRDY#, MZBT#,
and MEOC# pins are sampled synchronously with
the rising edge of MCLK. In a clocked memory bus
write, MDATA7-MDATA0 are driven synchronously
with MCLK or MOCLK.

MOCLK is a delayed version of MCLK. If a clocked
memory bus configuration is chosen, and the
MOCLK rising edge is detected by the 82490XP after
RESET, data will be driven off of MOCLK rather
than MCLK. Only data is affected by MOCLK.
MOCLK allows the system designer to increase
the minimum output time of MDATA relative
to MCLK.



7.32 MDATA0-MDATA7 

Memory Bus Data Pins 

82490XP Connection to the Memory Bus 

Input/Output of 82490XP (pins 18, 14, 10, 6, 16, 12, 
8, 4) 

Synchronous to CLK or MCLK or MOCLK or MISTB 
or MOSTB. 

7.32.1 SIGNAL DESCRIPTION 

MDATA0-7 is the 82490XP data bus connection to
the memory bus. All or part of these pins will be used
depending on the cache configuration. These pins
are directly controlled by the MDOE# input. With
MDOE# inactive, these pins are tristated and may
be used as inputs.

For write cycles, the 82495XP asserts CDTS# to 
indicate that data will be available at the MDATA 
pins or in its buffer. Data is output with respect to 
CLK, MCLK, MOCLK, or MEOC# and is strobed 
with MBRDY#. In strobed memory bus mode, data 
is output using MOSTB. 

For read cycles, CDTS# indicates that the CPU data 
path will be available for read data in the next clock. 
BRDY# reads data into the CPU from the 82490XP. 
Data is read into the 82490XPs through MDATA us- 
ing MBRDY# or MISTB. 

7.32.2 WHEN DRIVEN 

When the CPU or 82495XP initiates a write cycle, 
the write data is written to the appropriate 82490XP 
buffer and CDTS# is asserted. If MDOE# is active, 
that first piece of write data will be available at the 
MDATA pins with some delay from the CPU CLK 
edge that CDTS# is asserted. Subsequent pieces of 
write data are output with some delay from MCLK or 
MOCLK (mode dependent) from the edge that 
MBRDY# is sampled active. In strobed mode, sub- 
sequent data is output with MOSTB assertion. 

MDATA has no value before CDTS# assertion, after 
MEOC# with no pending cycle, or with MDOE# in- 
active. 

For read cycles, the 82495XP asserts CDTS# the 
clock before the MDATA path is available for read 
data. MDOE# must be inactive for the 82490XP to 
read data. Read data is strobed into the 82490XP by 
asserting MBRDY# on MCLK edges. MEOC# will 
latch the last piece of data as it switches buffers. In
strobed mode, data is read by MISTB. Data that is 
read into MDATA must meet proper setup and hold 
times. 

Data at the MDATA inputs need not follow setup and 
hold times to MCLK edges that sample MBRDY# 
inactive. 

7.32.3 RELATION TO OTHER SIGNALS 

CDTS# indicates that write data is in the 82490XP 
buffers. If MDOE# is active, write data is available at 
MDATA some time after CDTS# or MEOC# is sam- 
pled active. Subsequent write data is available at 
MDATA after MBRDY# assertion or MOSTB chang- 
ing. 



MDOE# must be inactive for MDATA to read data. 
CDTS# assertion by the 82495XP indicates that the 
read path is available in the next clock. Data must be 
read into MDATA with respect to MCLK or MISTB 
and must follow proper setup and hold times if 
MBRDY# is active or MISTB is changing. 

The memory bus controller must account for the 
large setup time required to read data into the CPU. 
If properly done, data can be read into MDATA by 
asserting MBRDY# and in the next full CPU clock 
read into the CPU using BRDY#. 



7.33 MDOE# 

Memory Data Output Enable 
Tristates/Enables Memory Data Outputs 
Input to 82490XP (pin 20) Cycle Control Signal 
Asynchronous 

7.33.1 SIGNAL DESCRIPTION 

MDOE# is an input to the 82490XP that, when as- 
serted, causes the 82490XP to drive its MDATA0- 
MDATA7 outputs. When MDOE# is inactive, these 
lines are floated and may be used as inputs to the 
82490XP. MDOE# is not sampled by any clock and 
is a direct connection to the 82490XP memory output
driver. 

7.33.2 WHEN SAMPLED 

Since MDOE# is a direct connection to the
82490XP memory output drivers, MDOE# must always
be driven to a valid level. Starting from MDOE#
inactive, data in the 82490XPs is driven to the MDATA
outputs with some propagation delay from MDOE#
going active. Similarly, there is some float delay from
MDOE# going inactive.

MDOE# must be inactive for the 82490XP to read 
memory data. 

7.33.3 RELATION TO OTHER SIGNALS 

MDOE# has no relation to MCLK, MOCLK, or 
MOSTB. Since MDOE# controls the final stage of 
the MDATA output buffers, it has no effect on any
other signal of the 82490XP. 



7.34 MEMLDRV 

Memory Low Capacitance Drivers 

Selects the Low Capacitance Drivers for the 
82495XP and the 82490XP

Inputs to 82495XP and 82490XP (pins Q4, 24)
Configuration Signal

Synchronous to CLK 

7.34.1 SIGNAL DESCRIPTION 

MEMLDRV is a pin on both the 82495XP and
82490XP that, when high during reset, selects the
normal memory output drivers. If this pin is driven
low at reset, the high capacitance drivers are selected.
The affected outputs are the 82495XP address
outputs to the memory bus and the 82490XP MDATA
outputs. The normal output drivers are designed to
drive up to 50 pF loads. The high capacitance drivers
can drive up to 100 pF without derating.

7.34.2 WHEN SAMPLED 

MEMLDRV is sampled as in Figure 7-1 with a setup
time of 4 CPU clocks for the 82495XP and 1 CPU 
clock for the 82490XP. On the 82495XP, MEMLDRV 
becomes the SYNC# input once FSIOUT# goes 
inactive. On the 82490XP, MEMLDRV becomes the 
MFRZ# signal which is sampled after the first mem- 
ory cycle begins. 

7.34.3 RELATION TO OTHER SIGNALS 

MEMLDRV shares a pin with SYNC# on the 
82495XP and MFRZ# on the 82490XP. 



7.35 MEOC# 

Memory End of Cycle 

Ends a cycle in 82490XP by switching buffers 

Input to 82490XP (pin 23) Cycle Control Signal 

Synchronous to MCLK or Asynchronous (strobed 
mode) 

7.35.1 SIGNAL DESCRIPTION

MEOC# is an input to the 82490XP that ends the
current cycle and switches memory buffers for a new
cycle. Switching to the next cycle does not cause
information to be lost in the memory or CPU buffers 
in the 82490XP, but rather switches new buffers to 
the memory I/O bus of the 82490XP. 

MEOC# is provided so that the memory system, 
which is synchronous to MCLK, can switch to a new 
cycle without synchronization. In clocked memory 
bus mode MEOC# is sampled with the rising edge 
of MCLK. In strobed memory bus mode the MEOC# 
function is performed with rising or falling edges of 
MEOC#. 



For read or write cycles, MEOC# may be activated 
on or after the clock edge of the last MBRDY# of 
the current cycle. If a cycle is pending (pipelining is 
used), the next cycle will flow-through with a propa- 
gation delay from MEOC# assertion. MEOC# is re- 
quired for all memory bus cycles. 

In addition to switching memory buffers, MEOC#
does three other things. One, MEOC# activation
causes the memory burst counter to be reset to its
start value and, if MSEL# is active, MZBT# is sampled.
This allows MSEL# to stay active between cycles.
Two, MEOC# activation during a write cycle
causes MFRZ# to be sampled for a subsequent
allocation (line-fill). Three, MEOC# latches in the
last slice of data (like MBRDY#) before switching
buffers.



7.35.2 WHEN SAMPLED 

In clocked memory bus mode, MEOC# is sampled 
on every MCLK edge. It must always observe setup 
and hold times to MCLK. In strobed memory bus 
mode, MEOC# is always sampled and must meet 
proper active/inactive times. 



7.35.3 RELATION TO OTHER SIGNALS 

MEOC# is provided so that a cycle may end on the 
memory bus before CRDY# can be asserted. The 
implication rules surrounding MEOC# are: 

1. MEOC# ≤ CRDY#

2. MEOC# for cycle N+1 ≥ 2 clocks after CRDY#
of cycle N

3. MEOC# for cycle N+1 ≥ 2 clocks after last
BRDY# of cycle N

4. MEOC# ≥ BGT#

MEOC# active with MSEL# active causes the sam- 
pling of MZBT# and MFRZ#. 



7.36 MFRZ#

Memory Data Freeze 

Freezes Memory Write Data in 82490XP Buffer 
Input to 82490XP (pin 24) Cycle Control Signal 
Synchronous to MCLK or Strobed 

7.36.1 SIGNAL DESCRIPTION 

MFRZ# is an input to the 82490XP that when active 
causes the 82490XP to "freeze" write data in the 
82490XP memory buffer and allow a subsequent al- 
location to fill a cache line around it. MFRZ# is provided
so that an actual write to memory need not be
done to perform an allocation. Using MFRZ# to per- 
form this dummy write cycle requires that the memo- 
ry bus controller put the allocated line into the "M" 
state. 

PALLC# must be active and MKEN# must be re- 
turned active for the write cycle to be turned into an 
allocation. MFRZ# is sampled when MEOC# goes 
active at the end of the write cycle. The subsequent 
line fill is then filled around the write data to com- 
plete the allocation. 

7.36.2 WHEN SAMPLED 

In clocked memory bus mode, MFRZ# is sampled 
with the MCLK rising edge that MEOC# is sampled 
active for all CPU write cycles. MFRZ# need only 
follow a proper setup and hold time in this situation. 

In strobed mode, MFRZ# is sampled with the falling 
edge of MEOC# for write cycles. MFRZ# need only 
follow a proper setup and hold time in this situation. 

7.36.3 RELATION TO OTHER SIGNALS 

MFRZ# is sampled with the MEOC# going active or 
being active for write cycles. MFRZ# is used so that 
a dummy write cycle can be performed. If an alloca- 
tion is done, DRCTM# must be asserted during the 
SWEND# window of the line fill to put the allocated 
line in the "M" state. 

MFRZ# shares a pin with the MEMLDRV configura- 
tion input. 
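
The MFRZ# conditions above can be summarized as a small predicate. This is a minimal sketch rather than the device's actual logic: the function names and boolean arguments are hypothetical, and `True` means the corresponding active-low pin was sampled asserted.

```python
def write_becomes_allocation(pallc_active: bool, mken_active: bool) -> bool:
    """A write cycle is turned into an allocation only when PALLC# is
    active and MKEN# is returned active."""
    return pallc_active and mken_active


def is_dummy_write(pallc_active: bool, mken_active: bool,
                   mfrz_at_meoc: bool) -> bool:
    """True when the write data is frozen in the 82490XP memory buffer
    (MFRZ# sampled active with MEOC# at the end of the write cycle) and
    the subsequent line fill is filled around it."""
    return write_becomes_allocation(pallc_active, mken_active) and mfrz_at_meoc


def drctm_required(dummy_write: bool) -> bool:
    """Memory was not updated by a dummy write, so the allocated line
    must be placed in the M state by asserting DRCTM# during the
    SWEND# window of the line fill."""
    return dummy_write
```

For example, `is_dummy_write(True, True, True)` describes a frozen write followed by a line fill, in which case `drctm_required` reports that DRCTM# must be asserted.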



7.37 MHITM#

Memory Bus Hit [M] 

Indicates snoop hit to modified line 

Output from 82495XP (pin H4) Snooping Signal 

Sync to CLK 

7.37.1 SIGNAL DESCRIPTION 

The MHITM# output is driven by the 82495XP dur- 
ing a snoop cycle to indicate that the snooping ad- 
dress has hit a Modified line. If the signal is logic 
high, the snoop has not hit a modified line; if the 
signal is logic low, the snoop has hit a modified line. 
When a snoop hits a modified line, the 82495XP au- 
tomatically schedules a write-back of the hit modi- 
fied line to the memory bus. 



When the device which controls the memory bus 
(the master) performs a memory access, a snoop is 
requested of all other caching devices on the bus 
(snoopers). An asserted MHITM# pin from any of 
the snooper 82495XPs alerts the master that main 
memory's data is stale, and that the bus must be 
temporarily given to the snooper which has its 
MHITM# asserted so that the modified line can be 
written out to the memory bus. 

7.37.2 WHEN DRIVEN 

The snoop lookup is performed in the clock in which 
SNPCYC# is asserted. The MHITM# result for the 
snoop is driven on the CLK following SNPCYC#, 
and remains valid until the next assertion of 
SNPSTB#. The MHITM# signal is not valid from 
SNPSTB# until the CLK after SNPCYC#. 

7.37.3 RELATION TO OTHER SIGNALS 

MHITM# and MTHIT# outputs together indicate the 
results of a snoop lookup in the 82495XP. 

An 82495XP can accept a snoop request while performing
memory bus transfers of its own. If a snoop
is requested of an 82495XP while it is performing a
data transfer of its own, the results of the snoop may
be delayed. If SNPSTB# is sampled at an 82495XP
after it has received BGT# for its own cycle, the
snoop lookup is performed (SNPCYC# active) after
the SWEND# of its own cycle, and MHITM# is driven
with valid results one CLK after SNPCYC# (see
Sections 6.2.4 and 6.2.5).
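
Since MHITM# and MTHIT# together report the snoop result, a memory bus controller model might decode the two pins as follows. This is an illustrative sketch: the enum, the function name, and the handling of the one combination the text does not describe are assumptions. Pin levels are given as 0 (low, asserted) or 1 (high).

```python
from enum import Enum


class SnoopResult(Enum):
    MISS = "miss"
    HIT = "hit to an unmodified line"
    HIT_MODIFIED = "hit to a modified line"


def decode_snoop(mthit_n: int, mhitm_n: int) -> SnoopResult:
    """Decode the snoop result from the MTHIT# and MHITM# pin levels,
    which are valid in the CLK after SNPCYC#."""
    if mthit_n not in (0, 1) or mhitm_n not in (0, 1):
        raise ValueError("pin levels must be 0 or 1")
    if mhitm_n == 0 and mthit_n == 1:
        # Assumption: a modified hit without a tag hit is not described
        # in the text, so this sketch rejects it.
        raise ValueError("MHITM# low with MTHIT# high is not described")
    if mhitm_n == 0:
        return SnoopResult.HIT_MODIFIED  # a write-back will be scheduled
    if mthit_n == 0:
        return SnoopResult.HIT
    return SnoopResult.MISS
```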



7.38 MISTB 

Memory Bus Input Strobe 

Strobes data into the 82490XP 

Input to 82490XP (pin 22) Cycle Control Signal 

Asynchronous 

7.38.1 SIGNAL DESCRIPTION 

MISTB is an input to the 82490XP that, on rising or 
falling edges, causes the 82490XP to latch its MDA- 
TA inputs. MISTB is used in strobed memory bus 
mode. In clocked memory bus mode, MISTB is the 
MBRDY# input.

7.38.2 WHEN SAMPLED

MISTB is always sampled by the 82490XP. MISTB 
must meet proper strobed mode active and inactive 
times. 



7.38.3 RELATION TO OTHER SIGNALS 

MISTB causes the latching of the 82490XP MDATA 
inputs in strobed mode. MISTB shares a pin with 
MBRDY#. 



7.39 MKEN#

Memory Cache Enable 
Determines 82495XP and CPU cacheability 
Input to 82495XP (pin R1) Cycle Attribute Signal 
Synchronous to CLK 

7.39.1 SIGNAL DESCRIPTION 

MKEN# is an input to the 82495XP that is sampled 
at the closing of the cacheability window (KWEND# 
is sampled active). The 82495XP drives KEN# back
to the CPU one clock after sampling the value of 
MKEN#. MKEN# thus determines whether the cur- 
rent cycle is cacheable in the 82495XP and in the 
CPU. 

For read cycles, if MCACHE# is active (cacheable), 
KEN# is driven out of the 82495XP to the CPU to 
indicate cacheability. If MKEN# is sampled inactive 
during KWEND# activation, KEN# is brought inac- 
tive by the 82495XP, and the line will not be cache- 
able by the CPU or 82495XP. If MCACHE# is inac- 
tive, the line will be non-cacheable regardless of 
MKEN#. PCD active will cause MCACHE# to be 
inactive. 

MKEN# is sampled during write-through cycles that 
are potentially allocatable (PALLC# is active during 
the write cycle). If MKEN# is sampled active during 
KWEND# activation of the write cycle, an allocation 
will occur, and a line-fill will follow the write cycle. 
MKEN# during the line-fill is ignored. The MBC indi- 
cates to the 82495XP that it intends to perform an 
allocation by asserting MKEN#. 

MKEN# must be sampled 1 clock before the first 
BRDY# assertion to make a line-fill non-cacheable 
to the CPU. 

7.39.2 WHEN SAMPLED 

MKEN# is sampled on the clock edge that 
KWEND# is first sampled active. In all other places 
MKEN# may violate setup and hold times. 



7.39.3 RELATION TO OTHER SIGNALS 

MKEN# and MRO# are sampled with KWEND# 
active. MKEN# must be sampled at least 2 clocks
before BRDY# assertion to make a line-fill non- 
cacheable. 
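
The read-cycle cacheability rule described for MKEN# can be sketched as a pair of predicates. This is a simplified illustration under stated assumptions: the function names are hypothetical, `True` means the active-low signal is asserted, and only the MCACHE#, PCD and MKEN# interactions named in the text are modeled.

```python
def mcache_can_be_active(pcd_active: bool) -> bool:
    """PCD active forces MCACHE# inactive; otherwise MCACHE# may be
    active, depending on the cycle type."""
    return not pcd_active


def read_line_cacheable(mcache_active: bool,
                        mken_active_at_kwend: bool) -> bool:
    """For read cycles, a line is cacheable only if MCACHE# is active
    and MKEN# is sampled active when KWEND# is sampled active. If
    MCACHE# is inactive, MKEN# is irrelevant."""
    return mcache_active and mken_active_at_kwend
```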



7.40 MOCLK 

Memory Data Output Clock 

Separate Clock Reference for Memory Data Output 

Input to 82490XP (pin 27) 

Asynchronous 

7.40.1 SIGNAL DESCRIPTION 

MOCLK is the latch enable for the 82490XP memory 
data outputs (MDATA). MOCLK controls the latching 
of a transparent latch which, when high, causes 
MDATA to be driven from MCLK. When low, MDATA 
is latched. MOCLK may only be used in clocked 
memory bus mode and only affects output data. It is 
provided so that a greater MDATA output hold time 
can be generated. 

To be used effectively, MOCLK must be a clock in- 
put that is skewed from MCLK. The following picture 
shows how MOCLK has increased the hold time of 
the output burst data:

[Figure not reproduced: MOCLK skewed from MCLK, increasing MDATA output hold time]

7.40.2 WHEN SAMPLED

MOCLK is sampled during and after RESET to de- 
termine whether output data should be driven from 
MCLK or MOCLK. If toggling, MOCLK controls the 
MDATA outputs with MCLK. If high, data is driven 
from MCLK alone. Regardless, input data is never 
referenced to MOCLK. 

In strobed memory bus mode the MOCLK signal becomes
MOSTB. MOCLK is only used in clocked
memory bus mode.

7.40.3 RELATION TO OTHER SIGNALS

To be used effectively, MOCLK must be the same 
frequency as MCLK but be skewed. This effectively 
increases MDATA hold time to main memory. Main 
memory must sample the data on MCLK edges. 

MOCLK shares a pin with the MOSTB signal. 



7.41 MOSTB 

Memory Bus Output Strobe 

Strobes data out of 82490XP 

Input to 82490XP (pin 27) Cycle Control Signal 

Asynchronous 

7.41.1 SIGNAL DESCRIPTION 

MOSTB is an input to the 82490XP that, on rising 
and falling edges, causes the 82490XP to output 
data through its MDATA outputs. MOSTB is only 
used in strobed memory bus mode. In clocked mem- 
ory bus mode, MOSTB is the MOCLK input. 

7.41.2 WHEN SAMPLED 

MOSTB is always sampled by the 82490XP. MOSTB 
must meet strobed mode active and inactive times. 



7.41.3 RELATION TO OTHER SIGNALS

MOSTB strobes data out of the 82490XP through
MDATA. MOSTB shares a pin with MOCLK.



7.42 MRO#

Memory Read-Only

Designates current line as read-only

Input to 82495XP (pin J1) Cycle Attribute Signal

Synchronous to CLK

7.42.1 SIGNAL DESCRIPTION

MRO# is an input to the 82495XP that is sampled at
the closing of the cacheability window (KWEND#
activation). If sampled active, it causes the current
line fill to the 82495XP to be put in the read-only
state, and causes the line to be non-cacheable to
the CPU. Writes to read-only lines in the 82495XP
are treated as write-misses that are non-allocatable
(PALLC# is inactive). MRO# is a bit in each
82495XP tag entry.

Once MRO# is sampled active during KWEND#
activation, KEN# to the CPU is driven inactive
regardless of the state of MKEN#. MKEN# does,
however, determine whether the 82495XP will cache the
read-only line. Once MRO# is returned active, the
CPU will only require the number of transfers
indicated by LEN and CACHE#. If MKEN# is returned
active, the 82495XP will require an entire cache line.
82495XP read-only cache lines are filled to the [S]
state.

The line-fill portion of an allocation may be filled to
the read-only state by returning MRO# active during
KWEND# of the line-fill. MRO# is ignored during
the write portion.

If MRO# is returned active during KWEND#,
DRCTM# and MWB/WT# are ignored during
SWEND#.

MRO# must be returned to the 82495XP at least 2
clocks before BRDY# is returned to the CPU so
KEN# can be sampled properly.

There is one Read-Only bit per tag in the 82495XP.



7.42.2 WHEN SAMPLED 

MRO# is sampled on the first clock that KWEND# 
is sampled active. In all other clocks, MRO# need 
not follow setup and hold times. 



7.42.3 RELATION TO OTHER SIGNALS 

MRO# and MKEN# are sampled with KWEND# 
activation. MRO# must be returned at least 2 clocks 
prior to the first BRDY#. 



7.43 MSEL#

Memory Buffer Chip Select 
Selects 82490XP, Causes Sampling of MZBT# 
Input to 82490XP (pin 25) Cycle Control Signal 
Synchronous to MCLK or Strobed

7.43.1 SIGNAL DESCRIPTION

MSEL# is an input to the 82490XP that has 3 main 
functions. One, MSEL# active qualifies the 
MBRDY# input to the 82490XP. If MSEL# is inac- 
tive for a particular 82490XP, MBRDY# will not be 
recognized by that 82490XP. 

Two, MSEL# going active causes the sampling of 
MZBT# for the next transfer. 

Three, MSEL# going inactive resets the 82490XP 
internal memory burst counter. The 82490XP con- 
tains a memory burst counter that counts through 
the CPU burst order with each MBRDY# assertion 
and increments a pointer to the 82490XP memory 
buffer being accessed. 

MSEL# going inactive will reset this burst counter to 
its original burst value. By resetting this counter be- 
fore MEOC# assertion, all information currently be- 
ing read into the 82490XP is lost, but information 
that is being written out is maintained and may be 
rewritten. 

In general, MSEL# may stay inactive for single 
transfer cycles such as posted 64-bit write cycles. 
Once active, MSEL# need not go inactive as the 
burst counter is reset with MEOC# activation. Since 
MZBT# may also be sampled with MEOC#, it is 
possible to leave MSEL# asserted throughout most 
basic transfers. 

MSEL# or MEOC# must be used to reset the burst 
counter before any transfer begins. If transfers are 
interrupted (by a snoop hit before BGT# assertion 
for example), MSEL# must be brought inactive so 
the burst counter may be reset for the snoop write 
back. 

MSEL# must be sampled inactive for at least 1 
MCLK after reset. This resets the memory burst 
counter for the first transfer. 



7.43.2 WHEN SAMPLED 

In clocked memory bus mode, MSEL# is sampled 
with all rising edges of MCLK. In this mode, if 
MSEL# is sampled inactive, the memory burst 
counter is reset and MZBT# is sampled. If MSEL# 
is sampled active and MBRDY# is sampled active, 
the memory burst counter is incremented. Since it is 
constantly sampled with MCLK, MSEL# must al- 
ways be driven to a known state and must always 
meet setup and hold times to every MCLK edge. 



In strobed mode, MSEL# falling edge causes the 
sampling of MZBT#. While MSEL# is active, MISTB 
and MOSTB cause the memory burst counter to be 
incremented. The rising edge of MSEL# causes the 
memory burst counter to be reset. 

MSEL# must be inactive sometime after RESET be- 
fore the first transfer to initialize the burst counter. 



7.43.3 RELATION TO OTHER SIGNALS 

MSEL# causes the sampling of MZBT#, and quali- 
fies the use of MBRDY#, MOSTB, and MISTB. 
Since MSEL# acts as a qualifier for these signals, 
MSEL# may be asserted at the same time as 
MBRDY#, MOSTB, or MISTB. 
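
The clocked-mode behavior of MSEL# described in 7.43.2 can be modeled as a small state machine evaluated on each MCLK rising edge. This is a behavioral sketch only: the class and argument names are hypothetical, `True` means the active-low input is sampled asserted, the counter is a plain index rather than the CPU burst order, and the precedence given to MEOC# over MBRDY# on the same edge is an assumption.

```python
class BurstCounterModel:
    """Toy model of the 82490XP memory burst counter in clocked mode."""

    def __init__(self) -> None:
        self.count = 0            # position within the current burst
        self.zbt_sampled = None   # last sampled MZBT# value, if any

    def mclk_rising(self, msel: bool, mbrdy: bool = False,
                    meoc: bool = False, mzbt: bool = False) -> None:
        if not msel:
            # MSEL# sampled inactive: reset the counter, sample MZBT#.
            self.count = 0
            self.zbt_sampled = mzbt
        elif meoc:
            # MEOC# also resets the counter and, with MSEL# active,
            # samples MZBT#, so MSEL# may stay asserted between cycles.
            self.count = 0
            self.zbt_sampled = mzbt
        elif mbrdy:
            # MBRDY# is only recognized while MSEL# is active.
            self.count += 1


model = BurstCounterModel()
model.mclk_rising(msel=False, mzbt=True)   # initialize after reset
for _ in range(3):
    model.mclk_rising(msel=True, mbrdy=True)
count_after_three_transfers = model.count
model.mclk_rising(msel=True, meoc=True)    # end of cycle, MSEL# held active
```

The model also shows why MSEL# must be sampled inactive at least once after reset: until that happens, nothing has placed the counter in a known state.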



7.44 MTHIT#

Memory Bus Tag Hit 

Indicates snoop hit 

Output from 82495XP (pin G3) Snooping Signal 

Sync to CLK 

7.44.1 SIGNAL DESCRIPTION 

The MTHIT# output is asserted by the 82495XP 
during snoop cycles to indicate that the snoop ad- 
dress has hit a line in the 82495XP cache. An as- 
serted MTHIT# signal from any of the snooping 
82495XP's alerts a bus master that the data being 
accessed resides in another cache. If SNPINV was 
not asserted on the snoop request, the copy of the 
data in a 82495XP asserting MTHIT# will remain 
valid and in the Shared state — so a caching master 
must also place his copy of the data in the Shared 
state. 

7.44.2 WHEN DRIVEN 

The snoop lookup is performed in the CLK in which 
SNPCYC# is asserted. The MTHIT# result for the 
snoop is driven on the next CLK and remains valid 
until the next assertion of SNPSTB#. The MTHIT# 
signal is not valid from SNPSTB# until the CLK after 
SNPCYC#. 

7.44.3 RELATION TO OTHER SIGNALS 

MTHIT# and MHITM# together indicate the results 
of a snoop lookup in the 82495XP.

An 82495XP can accept a snoop request while performing
memory bus transfers of its own. If a snoop
is requested while it is performing a transfer of its
own, the results of the snoop may be delayed. If
SNPSTB# is sampled at an 82495XP after it has received
BGT# for its own cycle, the snoop lookup is
performed (SNPCYC# active) after the SWEND# of
its own cycle, and MTHIT# is driven with the valid
result one CLK after SNPCYC# (see Sections 6.2.4
and 6.2.5).

Because an asserted MTHIT# from any snooping 
82495XP requires the master to place the fetched 
line in the Shared state (unless it is an invalidating 
snoop), the memory bus controller should include 
the MTHIT# signals of other processors when gen- 
erating the MWB/WT# signal to its own 82495XP. 



7.45 MWB/WT# 

Memory Write-back/Write-through 
Forces lines to be filled to the [S] state 
Input to 82495XP (pin K3) Cycle Attribute Signal 
Synchronous to CLK 

7.45.1 SIGNAL DESCRIPTION 

MWB/WT# is an input to the 82495XP that is sam- 
pled at the closing of the snoop window (SWEND# 
activation). If sampled active, the current line-fill is 
filled to the [S] state in the 82495XP. The [S] state 
is a write-through state in the 82495XP. 

MWB/WT# is used in many cases. If a cache-to-cache
transfer updates memory and leaves the data
valid in the other cache, the line must be filled to the 
[S] state instead of the [E] state default. A portion of 
memory may be designated as write-through by as- 
serting MWB/WT# for appropriate addresses. 

MWB/WT# has no effect on the 82495XP if 
DRCTM# is sampled active or MRO# has been 
sampled active during KWEND#. If PWT is active, 
MWB/WT# has no effect and the line is filled to the 
[S] state. 



7.45.2 WHEN SAMPLED

MWB/WT# is sampled on the first clock edge that
SWEND# is sampled active. If MWB/WT# is not
being sampled, it need not follow setup and hold
times.



7.45.3 RELATION TO OTHER SIGNALS

Both MWB/WT# and DRCTM# are sampled with
SWEND#.



7.46 MX4/MX8#
MTR4/MTR8#

Memory 4/8 I/O bits

Memory 4/8 Transfers

Selects MDATA Input/Output width and number of
memory bus transfers

Inputs to 82490XP (pins 21, 25) Configuration Signals
Synchronous to CLK

7.46.1 SIGNAL DESCRIPTION

MX4/MX8# configures the 82490XP to use
MDATA[0:3] or MDATA[0:7] memory bus I/O pins.
MTR4/MTR8# selects whether a cache line will
take 4 or 8 transfers. These selections depend on
the line ratio (82495XP line size / CPU line size) and
must be configured according to the following table:

Line    MX4/    MTR4/   Membus     CPUbus
Ratio   MX8#    MTR8#   I/O Pins   I/O Pins

1       1       1       4          4
2       1       0       4          4
2       0       1       8          4
4       0       0       8          4
1       0       1       8          8
2       0       0       8          8

7.46.2 WHEN SAMPLED

These signals are sampled as shown in Figure 7-1, with a
setup time of 1 clock. Once the first CADS# is issued
by the 82495XP these signals are sampled for the
MZBT# and MSEL# functions.



7.46.3 RELATION TO OTHER SIGNALS

MX4/MX8# shares a pin with MZBT# and MTR4/
MTR8# shares a pin with MSEL#.
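
The MX4/MX8# and MTR4/MTR8# configuration table lends itself to a lookup keyed by line ratio and I/O widths. The sketch below is illustrative only: the dictionary and function names are hypothetical, cells that appear blank in the table are assumed to read 0, and the mapping of a high level to "4" and a low level to "8" is inferred from the pin names.

```python
# (line ratio, membus I/O pins, CPUbus I/O pins) -> (MX4/MX8#, MTR4/MTR8#)
# Assumption: blank table cells are read as 0.
STRAP_TABLE = {
    (1, 4, 4): (1, 1),
    (2, 4, 4): (1, 0),
    (2, 8, 4): (0, 1),
    (4, 8, 4): (0, 0),
    (1, 8, 8): (0, 1),
    (2, 8, 8): (0, 0),
}


def strap_levels(line_ratio: int, membus_pins: int, cpubus_pins: int):
    """Return the (MX4/MX8#, MTR4/MTR8#) levels for a configuration,
    or raise ValueError if the combination is not in the table."""
    key = (line_ratio, membus_pins, cpubus_pins)
    if key not in STRAP_TABLE:
        raise ValueError(f"unsupported configuration: {key}")
    return STRAP_TABLE[key]


def memory_transfers_per_line(mtr_level: int) -> int:
    """MTR4/MTR8# high selects 4 transfers, low selects 8 (per the name)."""
    return 4 if mtr_level == 1 else 8
```

Under these assumptions, MX4/MX8# tracks the memory bus I/O width in every row of the table (1 for 4 pins, 0 for 8 pins).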



7.47 MZBT#

Memory Zero Base Transfer 
Forces cycles to begin at subline address 0
Input to 82490XP (pin 21) Cycle Control Signal
Synchronous to MCLK or Strobed

7.47.1 SIGNAL DESCRIPTION

MZBT# is an input to the 82490XP that forces a
read or write cycle to begin with burst address 0
regardless of the CPU generated address.

MZBT# is sampled before the transfer begins. 
MZBT# is sampled with MSEL# and MEOC#. 
MZBT# is sampled with MSEL# going active for the 
current cycle. If MSEL# stays active between cy- 
cles, MZBT# is sampled with MEOC# going active 
for the previous cycle. 

Once sampled, data input to the 82490XP's will start 
at burst address 0 and continue through 4, 8, C, etc.
If the CPU is requesting a burst location other than 
0, the memory bus controller must hold off any 
BRDY# until that bursted item is read from the 
memory bus. 



7.47.2 WHEN SAMPLED 

In clocked mode, MZBT# is sampled in two loca- 
tions. First, MZBT# is sampled on all MCLK rising 
edges where MSEL# is sampled inactive. Once 
MSEL# is sampled active, the value of MZBT# that 
was sampled one MCLK before is used for the next 
transfer. 

Second, MZBT# is sampled on MCLK rising edges 
where MEOC# is sampled active with MSEL# ac- 
tive. The MZBT# value sampled will be used for the 
next transfer. This allows MSEL# to stay asserted 
between transfers if so desired. 

In strobed mode, MZBT# is sampled with the same 
two signals. First, it is sampled with the falling edge 
of MSEL#. Second, it is sampled with the falling 
edge of MEOC# if MSEL# is active. 

In clocked memory bus mode MZBT# must follow 
setup and hold times to all MCLK edges where 
MSEL# is sampled inactive or MEOC# is sampled 
active with MSEL# active. 

In strobed memory bus mode MZBT# must meet 
setup and hold times to MSEL# falling edge and 
MEOC# falling edge if MSEL# is active. 



7.47.3 RELATION TO OTHER SIGNALS 

MZBT# is sampled with MSEL# and MEOC# and 
has no effect otherwise. In systems that will never 
force a zero-based transfer, MZBT# may be driven 
high after RESET. 

MZBT# shares a pin with the MX4/MX8# configu- 
ration input. 



7.48 NCPFLD# 

Non-Cacheable PFLD 

Enables Non-Cacheable Floating Point Loads 

Input to 82495XP (N4) Configuration Signal 

Asynchronous

7.48.1 SIGNAL DESCRIPTION 

During RESET, this pin functions as the NCPFLD#
configuration signal. The 82495XP can be config- 
ured to decode i860 XP CPU PFLD (Pipelined Float- 
ing Point Load) cycles. The 82495XP supports 3 op- 
erational modes for PFLD cycle decoding as defined 
by FPFLDEN and NCPFLD#: 

Mode #1. PFLD cycles that are cached in the 
82495XP. 

Mode #2. PFLD cycles not cached in the 82495XP, 
without an external PFLD extension 
FIFO. 

Mode #3. PFLD cycles not cached in the 82495XP, 
with an external PFLD extension FIFO. 



Mode #         FPFLDEN   NCPFLD#

1              0         1
2              0         0
3              1         1
Illegal Mode   1         0

See Section 5.2.5 for details.

7.48.2 CASES IT IS ASSERTED AND DEASSERTED

NCPFLD# is sampled on the falling edge of RESET 
and is a don't care at any other time. NCPFLD# 
must be valid for at least 10 CLK's before RESET's 
falling edge. 

7.48.3 RELATION TO OTHER SIGNALS 

NCPFLD# shares a pin with FLUSH#. Both
NCPFLD# and FPFLDEN describe the PFLD mode 
used. 



7.49 NENE# 

Next Near 

Indicates current cycle address is near previous one. 
Output from 82495XP (pin D5) Cycle Control Signal 
Synchronous to CLK 

7.49.1 SIGNAL DESCRIPTION 

NENE# indicates to the MBC that the address of 
the requested memory cycle is "near" the address 
of the previously generated one (in the same 2K 
DRAM page). This information may be used by the 
MBC to optimize access to paged or static column 
DRAMs. 

7.49.2 WHEN DRIVEN 

NENE# is valid together with CADS# and will stay 
valid until CNA# or CRDY#. 

7.49.3 RELATION TO OTHER SIGNALS 

Address and cycle specification signals (MSET0- 
MSET10, MTAG0-MTAG11, MCFA0-MCFA6, 
CW/R#, CM/IO#, CD/C#, RDYSRC, MCACHE#, 
NENE#, SMLN#, KLOCK#, and CPLOCK#) will be 
valid with CADS#. 

NENE# may change state after CNA# or CRDY# 
are asserted to the 82495XP. 
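
The "near" test that NENE# reports can be pictured as a page comparison between consecutive cycle addresses. The sketch below is an assumption-laden illustration: it treats "2K" as 2048 addressable bytes and compares byte addresses, neither of which the text states explicitly, and the function name is hypothetical.

```python
PAGE_SHIFT = 11  # assumption: 2K page = 2**11 bytes


def is_near(previous_address: int, current_address: int) -> bool:
    """True when both addresses fall within the same 2K DRAM page, the
    condition under which NENE# would be asserted with CADS#."""
    return (previous_address >> PAGE_SHIFT) == (current_address >> PAGE_SHIFT)
```

An MBC could use such an indication to skip the row-address phase for paged or static column DRAMs when consecutive cycles stay within one page.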



7.50 PALLC# 

Potential Allocate 

Indicates 82495XP intent to allocate current cycle 
Output from 82495XP (pin D2) Cycle Control Signal 
Synchronous to CLK 



7.50.1 SIGNAL DESCRIPTION 

PALLC# indicates to the MBC that the current write 
cycle may allocate (perform a line-fill on) a cache 
line. The MBC chooses to perform an allocation by 
asserting MKEN# during KWEND# of the write cy- 
cle. Potential allocate cycles are cycles which are 
82495XP misses with PCD and PWT inactive. 

The exact condition for assertion of PALLC# is: 
Miss * !PCD * !PWT * LOCK# * W/R# * D/C# * M/IO# 

PALLC# is inactive (HIGH) for any write-hit to a 
Read-Only line. 
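
The PALLC# assertion condition can be written directly as a boolean function. This is a minimal sketch with a hypothetical function name: `miss`, `pcd` and `pwt` are logical conditions, while `lock_n`, `w_r_n`, `d_c_n` and `m_io_n` are the pin levels of LOCK#, W/R#, D/C# and M/IO# as they appear in the equation (1 = high).

```python
def pallc_asserted(miss: bool, pcd: bool, pwt: bool,
                   lock_n: int, w_r_n: int, d_c_n: int, m_io_n: int) -> bool:
    """Miss * !PCD * !PWT * LOCK# * W/R# * D/C# * M/IO#

    Returns True when PALLC# is active: an unlocked memory data write
    that misses the 82495XP with PCD and PWT both inactive."""
    return bool(miss and not pcd and not pwt
                and lock_n and w_r_n and d_c_n and m_io_n)
```

A write hit to a read-only line is a hit, so `miss` is false and the function returns False, consistent with PALLC# being inactive for that case.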

7.50.2 WHEN DRIVEN 

PALLC# is valid in the same CLK as CADS# and is
valid until CRDY# or CNA#. 

7.50.3 RELATION TO OTHER SIGNALS 

PALLC# is valid with CADS#. 



7.51 PAR# 

Parity Selection 

Selects 82490XP as a Parity Device 

Input to 82490XP (pin 32) Configuration Signal 

Synchronous to CLK 

7.51.1 SIGNAL DESCRIPTION 

PAR# is a strapping option on the 82490XP that, 
when strapped low, configures that 82490XP device 
to be a dedicated parity device. An 82490XP parity
device must be configured the same as all the other 
devices, however, the data lines are defined differ- 
ently. CDATA[0:3] are 4 parity bit I/O lines and 
CDATA[4:7] are 4 bit select lines so each parity line 
may be written individually. Parity devices must be 
used as follows: 




Cache   Memory      Number of        82490XP I/O Bits
Size    Bus Width   Parity Devices   (CPU:Mem)

256K    64          2                4:4
512K    128         2                4:8

7.51.2 WHEN SAMPLED

PAR# is a strapping option and must be tied either 
high or low. 



7.51.3 RELATION TO OTHER SIGNALS 

PAR# affects the definition of the CDATA and MDA- 
TA lines of the 82490XP. 



7.52 RDYSRC 

Ready Source 

Cycle control signal to the MBC 

Output from 82495XP (pin C1) Cycle Control Signal 

Synchronous to CLK 

7.52.1 SIGNAL DESCRIPTION

RDYSRC serves as a cycle control signal to the 
MBC. It indicates the source of the BRDY# genera- 
tion (either 82495XP or MBC) for the CPU. When 
high it indicates that the MBC should generate the 
BRDY#s to the CPU, when low it indicates that the 
82495XP will provide the BRDY#s. 

RDYSRC is asserted for line-fill and not asserted for 
the write portion of allocation cycles. 

7.52.2 WHEN DRIVEN 

RDYSRC is valid in the same CLK as CADS# and is 
valid until CRDY# or CNA#.

7.52.3 RELATION TO OTHER SIGNALS 

Address and cycle specification signals (MSET0-
MSET10, MTAG0-MTAG11, MCFA0-MCFA6,
CW/R#, CM/IO#, CD/C#, RDYSRC, MCACHE#,
NENE#, SMLN#, KLOCK#, and CPLOCK#) will be
valid with CADS#.



7.53 RESET 

Reset 

Forces the 82495XP to begin execution in a known 
state 

Input to 82495XP (Q5) 

Asynchronous 



7.53.1 SIGNAL DESCRIPTION 

The falling edge of this signal tells the 82495XP to 
sample all configuration inputs and initializes the 
82495XP to a known state. See the specific configu- 
ration signals for setup and hold times relative to 
RESET's falling edge. RESET can be asserted at
any time. 

During initialization, the 82495XP LRU bits are set
to 1, indicating that the 82495XP LRU way is way 1.
The 82490XP MRU bits are initialized to 0, as are all
tag array bits.

RESET takes about 4100 clocks in the 82495XP. 
RESET with self-test takes about 80,000 clocks. 

7.53.2 WHEN SAMPLED 

RESET is an asynchronous input. RESET must have 
a pulse width of at least 8 CLK's in order to guaran- 
tee 82495XP recognition. 

7.53.3 RELATION TO OTHER SIGNALS 

The following signals are sampled at RESET: 



CNA# [CFG0]:        CFG0 line of 82495XP configuration inputs

SWEND# [CFG1]:      CFG1 line of 82495XP configuration inputs

KWEND# [CFG2]:      CFG2 line of 82495XP configuration inputs

FLUSH# [NCPFLD#]:   If low, enables decoding of i860XL
                    non-cacheable PFLD mode.

FPFLD# [FPFLDEN]:   If high, enables the external FIFO for
                    i860XL PFLD mode.

BGT# [C490LDRV]:    Indicates the driving strength of the
                    82495XP/82490XP interface.

SYNC# [MEMLDRV]:    Indicates the memory bus driving strength.

SNPCLK# [SNPMD]:    Indicates the snooping mode;
                    synchronous or strobed.

CFG2-CFG0:          Configure cache parameters such as
                    lines/sector, line ratio, and number of tags.



7.54 SLFTST#

Self Test 

Executes 82495XP self-test 

Input to 82495XP (pin M2) Test Signal 

Synchronous to CLK 



7.54.1 SIGNAL DESCRIPTION

If SLFTST# is sampled low and HIGHZ# is sampled
high, the 82495XP will perform a self-test after
reset. The results of the self-tests are given by
CAHOLD when FSIOUT# goes inactive.



7.54.2 WHEN SAMPLED

SLFTST# is sampled with reset as shown in Figure 7-1, with a
setup time of 10 CPU clocks. SLFTST# is then a
"don't care" until after the first CADS# activation
when it becomes the CRDY# pin.

7.54.3 RELATION TO OTHER SIGNALS

SLFTST# shares a pin with CRDY#. The 82495XP
enters self-test if both SLFTST# is sampled active
and HIGHZ# is sampled inactive.



7.55 SMLN#

Same Line

Current cycle is same 82495XP line as previous one.
Output from 82495XP (pin C6) Cycle Control Signal
Synchronous to CLK

7.55.1 SIGNAL DESCRIPTION

SMLN# is used to indicate to the MBC that the current
cycle is accessing the same 82495XP cache
line as the previous cycle. This indication can be
used by the MBC to selectively activate its
SNPSTB# signal to other caches in the system. For
example, back-to-back snoop hits to the same line
may be snooped only once.

7.55.2 WHEN DRIVEN

SMLN# is asserted with CADS# and will stay valid
until CNA# or CRDY#.

7.55.3 RELATION TO OTHER SIGNALS

Address and cycle specification signals (MSET0-
MSET10, MTAG0-MTAG11, MCFA0-MCFA6,
CW/R#, CM/IO#, CD/C#, RDYSRC, MCACHE#,
NENE#, SMLN#, KLOCK#, and CPLOCK#) will be
valid with CADS#.



7.56 SNPADS# 

Cache Snoop Address Strobe 

Initiates a snoop write back cycle 

Output from 82495XP (pin F3) Snooping Signal 

Sync to CLK 

7.56.1 SIGNAL DESCRIPTION 

The SNPADS# signal indicates valid cache control 
and attribute signals, functioning identically to 
CADS#, but is generated only on snoop write- 
backs. The separation of address status signals for 
normal and snoop write-back cycles eases memory 
bus controller implementation. When SNPADS# is 
activated, the memory bus controller should abort all 
pending cycles for which BGT# has not been is- 
sued. The 82495XP reissues these non-committed 
cycles after the snoop write-back has completed. 

7.56.2 WHEN DRIVEN 

SNPADS# is produced when a snoop hits a modi- 
fied line. A modified line condition exists when a line 
in the cache has been updated, and copies of that 
memory location in other devices are no longer val- 
id. A snoop is initiated by the master of a shared bus 
when accessing a memory location on the shared 
bus. 

The response of the 82495XP to a snoop appears 
on the MTHIT# and MHITM# pins in the clock after 
SNPCYC# is active. If these pins are both driven 
low, the snoop resulted in a hit to a modified line, 
and a snoop write-back is initiated with the assertion 
of SNPADS#. SNPADS# is driven, at earliest, two 
clocks after SNPCYC#. Like CADS#, SNPADS# is 
active for one CLK, and is always valid. 

7.56.3 RELATION TO OTHER SIGNALS 

Cycles initiated by SNPADS# require only CRDY#; 
they do not require the other cycle progress signals 
(BGT#, KWEND#, SWEND#).

The SNPADS# signal is driven by the 82495XP to
indicate the start of the write-back cycle; the 
82495XP drives the following address and cycle 
specification signals valid with SNPADS#: CW/R#, 
CD/C#, CM/IO#, MCACHE#, RDYSRC, NENE#, 
SMLN#, and the address on MSET[0:10], 
MTAG[0:11], and MCFA[0:6]. Upon assertion of 
SNPADS#, the memory bus controller should can- 
cel all pending cycles for which BGT# has not yet 
been asserted, because they will be reissued after 
the snoop write-back. The 82495XP will ignore 
BGT# while SNPBSY# and MHITM# are active (i.e.,
during the write-back). 

The 82495XP can accept a snoop request while per- 
forming memory bus transfers of its own. If a snoop 
is requested while it is performing a transfer of its 
own, the results of the snoop and any necessary 
snoop write-backs may be delayed. If SNPSTB# is 
sampled at a 82495XP after it has received BGT# 
for its own cycle, and the snoop hits a modified line, 
the snoop write-back will occur after CRDY# for the 
82495XP's own cycle. See Sections 6.2.4 and 6.2.5 
for details. 



7.57 SNPBSY#

Snoop Busy

Indicates additional snoop processing in progress
Output from 82495XP (pin F1) Snooping Signal
Sync to CLK

7.57.1 SIGNAL DESCRIPTION

SNPBSY# and SNPCYC# indicate a snoop in progress.
The SNPCYC# signal is asserted on the actual
snoop look-up to the 82495XP tags. If the snoop
look-up indicates a valid line is hit and the snoop is
invalidating, the 82495XP must perform a back
invalidation on the CPU. If a snoop hit occurs to a
modified line, a snoop write-back must occur. SNPBSY#
is asserted and remains active while either a back
invalidation or a snoop write-back is in progress.

7.57.2 WHEN DRIVEN

SNPBSY# is activated for two conditions. First,
SNPBSY# is activated whenever a back invalidation
is necessary: the snoop returns MTHIT# active and
SNPINV was asserted on the snoop initiation. Second,
SNPBSY# is activated when a modified cache
line is hit on a snoop, as indicated by MHITM#, until
the modified line has been written back (CRDY#
returned for the write-back).

SNPBSY# is valid in the CLK following SNPCYC#,
and if active, remains active for a minimum of two
CLKs.

7.57.3 RELATION TO OTHER SIGNALS

After SNPCYC# occurs for a snoop, a new snoop
may be initiated. If SNPBSY# is asserted for the
initial snoop, the SNPCYC# of the second snoop is
delayed until the SNPBSY# signal is deasserted for
the initial snoop, indicating that its snoop processing
has completed.



7.58 SNPCLK [SNPMD]

Snoop Clock [Snooping Mode]
Selects 82495XP snooping mode
Input to 82495XP (pin S3) Snooping Signal
Synchronous to CLK

7.58.1 SIGNAL DESCRIPTION

SNPMD selects whether 82495XP snoop initiation
is in synchronous, clocked, or strobed mode.
82495XP snoop response is always synchronous to
CLK.

Synchronous mode (to CLK) is selected by SNPMD
sampled low during reset. Strobed mode is selected
by SNPMD sampled high during reset. Clocked
mode is selected by connecting the snoop clock
source to SNPMD, and thus SNPMD becomes the
actual snoop clock (SNPCLK).

7.58.2 WHEN SAMPLED 

SNPMD is sampled as shown in Figure 7-1, with a setup time
of 4 CPU clocks. SNPMD is then not used unless 
clocked mode is being selected. If clocked mode is 
selected, SNPMD becomes SNPCLK to clock in 
snoop requests. 

7.58.3 RELATION TO OTHER SIGNALS 

SNPMD becomes SNPCLK if a clock signal is de- 
tected at reset. In this clocked mode, SNPCLK is 
then used to clock-in SNPSTB#, the snoop ad- 
dress, and all snoop attributes. 
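
The mode selection described for SNPMD can be sketched as a small classifier. This is only an illustrative model: the function name, the list-of-samples input, and the rule "any level change means a clock is present" are assumptions of the sketch, not the device's documented detection circuit.

```python
from enum import Enum


class SnoopMode(Enum):
    SYNCHRONOUS = "synchronous"  # SNPMD sampled low during reset
    STROBED = "strobed"          # SNPMD sampled high during reset
    CLOCKED = "clocked"          # SNPMD carries a clock and becomes SNPCLK


def select_snoop_mode(snpmd_samples):
    """Classify SNPMD from 0/1 levels observed during reset.

    Assumption: a changing level is treated as a snoop clock.
    """
    if not snpmd_samples:
        raise ValueError("need at least one SNPMD sample")
    if len(set(snpmd_samples)) > 1:
        return SnoopMode.CLOCKED
    if snpmd_samples[0]:
        return SnoopMode.STROBED
    return SnoopMode.SYNCHRONOUS
```

A steady low input selects synchronous mode, a steady high input selects strobed mode, and a toggling input selects clocked mode.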



7.59 SNPCYC#

Snoop Cycle 

Indicates snoop look-up occurring in 82495XP tags 
Output from 82495XP (pin H3) Snooping Signal
Sync to CLK

7.59.1 SIGNAL DESCRIPTION

SNPCYC# is asserted by the 82495XP during the 
clock when the actual tag look-up for the snoop is 
performed. SNPCYC# may appear as early as the 
CLK following SNPSTB# assertion, or may be delayed
several clocks while a snoop write-back or
82495XP memory bus cycle takes place.



7.59.2 WHEN DRIVEN 

SNPCYC# is always a valid 82495XP output. It is 
asserted once, for a single clock, for every snoop 
which is initiated in the 82495XP. 



7.59.3 RELATION TO OTHER SIGNALS 

A snoop is initiated by assertion of the SNPSTB# 
input if MAOE# is not asserted. The actual snoop, 
signalled by the assertion of SNPCYC#, can be de- 
layed by a prior snoop's write-back in progress 
(SNPBSY# asserted) or by an 82495XP memory
cycle in progress (SNPSTB# occurs after BGT#);
see SNPSTB# for details. If neither of these is
occurring, strobed and clocked snooping modes can
also delay snoop look-up for a clock while the snoop 
address and attributes are synchronized. 

In the clock following SNPCYC#, MHITM# and 
MTHIT# report valid snoop results. 
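
Once MTHIT# and MHITM# report the snoop result, the follow-up work that keeps SNPBSY# active (see SNPBSY#) can be expressed as a small truth function. This is a minimal sketch; the function name is hypothetical, and the arguments are logical "asserted" flags rather than pin voltage levels.

```python
def snoop_followup(mthit, mhitm, snpinv):
    """Return (back_invalidate, write_back, snpbsy) for one snoop.

    mthit  -- MTHIT# asserted (snoop hit a valid line)
    mhitm  -- MHITM# asserted (snoop hit a modified line)
    snpinv -- SNPINV was asserted when the snoop was initiated
    """
    back_invalidate = bool(mthit and snpinv)
    write_back = bool(mhitm)
    # SNPBSY# is active while either operation is in progress.
    return back_invalidate, write_back, back_invalidate or write_back
```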



7.60 SNPINV 

Snoop Invalidation 

Forces invalidation of snoop hits 

Input to 82495XP (pin P5) Snooping Signal 

Sampled with SNPSTB# (see SNPSTB#) 

7.60.1 SIGNAL DESCRIPTION 

Assertion of the SNPINV signal during the initiation 
of a snoop request forces a snoop hit for that re- 
quest into the Invalid state. 

The SNPINV pin is sampled upon initiation of a
snoop request with SNPSTB# activation, depending
on snooping mode: rising edge of first CLK when
SNPSTB# is asserted (synchronous snooping mode),
or rising edge of first SNPCLK when SNPSTB# is
asserted (clocked mode), or falling edge of
SNPSTB# (strobed mode).



7.60.2 WHEN SAMPLED 

When a bus master performs a bus access, the
SNPSTB# of all other 82495XPs is asserted to
initiate a snoop for that address. If the master's access
is one which modifies the data (a write to
memory, etc.), the SNPINV pin of all snooping 82495XPs
must be asserted during SNPSTB# so that the line
is properly marked Invalid.

SNPINV is not asserted during SNPSTB# assertion
if snoop hits are to remain valid: the master issuing
the snoop does not require their invalidation (a
read).

SNPINV assertion forces all snoop hits to be
invalidated, overriding other inputs or attributes (i.e.,
SNPNCA). When SNPINV is not asserted, cache
states change according to normal protocol.

SNPINV is only sampled with SNPSTB#, which may
be qualified by CLK or SNPCLK depending on the
snooping mode, and must meet setup and hold
times for the edge of its sampling. When SNPSTB#
is not being asserted, SNPINV is a don't care and
need not follow setup and hold times.

7.60.3 RELATION TO OTHER SIGNALS 

SNPINV is sampled according to SNPSTB#, which
may be qualified by SNPCLK or CLK, depending on
the snooping mode. SNPINV overrides the SNPNCA
input, which may also be asserted with SNPSTB#. If
MAOE# is active with SNPSTB# sampling, the
snoop request is ignored.



7.61 SNPNCA 

Snoop Non Caching device Access 

Indicates to snooping 82495XP that the initiating
master is a non-caching device

Input to 82495XP (pin Q3) Snooping Signal 

Sampled with SNPSTB# (see SNPSTB#) 

7.61.1 SIGNAL DESCRIPTION 

SNPNCA indicates that the master which is initiating 
the snoop request will not cache the data. If the 
SNPNCA pin is not asserted and the snoop is nonin- 
validating (where noninvalidating = SNPINV not as- 
serted), a snoop hit line must be placed in the 
Shared state, since the data will exist in another
cache. If SNPNCA is asserted and the snoop is
noninvalidating, a snoop hit line will not be entered into a
new cache, so a hit Exclusive or Modified line will be 
placed in the Exclusive state by the 82495XP. A 
noninvalidating snoop hit to a Shared line must keep 
the hit line in the Shared state, regardless of 
SNPNCA. 

SNPNCA is sampled upon initiation of a snoop re- 
quest with SNPSTB# activation, depending on the 
snooping mode: rising edge of first CLK when 
SNPSTB# asserted (synchronous snooping mode), 
or the rising edge of SNPCLK when SNPSTB# is 
asserted (clocked snooping mode), or the falling 
edge of SNPSTB# (strobed snooping mode). 

7.61.2 WHEN SAMPLED 

To achieve maximum processor performance and
minimum bus traffic, SNPNCA should be asserted
when the noninvalidating snoop is caused by an
access from a non-caching device such as a DMA
device.

If the snoop is being caused by a device which will 
also be caching the data, SNPNCA must not be as- 
serted, so that the 82495XP does not leave the hit 
line in an Exclusive state — subsequent writes to 
lines in this state do not appear on the bus, and stale 
data would result in the cache which incorrectly as- 
serted SNPNCA. 

If SNPNCA is asserted on a noninvalidating snoop 
request, the following outlines the behavior of the 
cache for a snoop hit in each of the MESI states: 

Modified    The data is written to the bus, and the
            line is placed in the Exclusive state.

Exclusive   The line remains in the Exclusive state.

Shared      The line remains in the Shared state.

Invalid     This is a cache miss. The line remains
            Invalid.



If SNPNCA is NOT asserted on a noninvalidating 
snoop request, an M, E, or S state hit line will be 
placed in the Shared state. Again, M state causes a 
write to the bus, Invalid lines remain Invalid. 
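
The state changes described for SNPINV and SNPNCA can be collected into one lookup function. The following is a sketch under the rules stated in this section only; the function name and the single-letter state encoding are assumptions made for illustration.

```python
def snoop_next_state(state, snpinv, snpnca):
    """Return (next_state, write_back) for a snoop that finds `state`.

    state is one of "M", "E", "S", "I".
    """
    if state not in ("M", "E", "S", "I"):
        raise ValueError("unknown MESI state: %r" % (state,))
    if state == "I":
        return "I", False            # cache miss, line stays Invalid
    write_back = state == "M"        # a hit to a modified line is written back
    if snpinv:
        return "I", write_back       # SNPINV overrides SNPNCA
    if snpnca and state in ("M", "E"):
        return "E", write_back       # requester will not cache the line
    return "S", write_back           # line may now exist in another cache
```

For example, `snoop_next_state("M", False, True)` gives `("E", True)`: the data is written to the bus and the line is kept in the Exclusive state.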

SNPNCA is only sampled with SNPSTB#, which
may be qualified by CLK or SNPCLK depending on
the snooping mode, and must meet setup and hold
times for the edge of this sampling. When
SNPSTB# is not being sampled, SNPNCA is a don't
care and need not follow set-up and hold times.

7.61.3 RELATION TO OTHER SIGNALS 

SNPNCA is sampled with SNPSTB#, which may be
qualified by SNPCLK or CLK, depending on snooping
mode. The assertion of SNPINV overrides
SNPNCA, and places all snoop hit lines into the
Invalid state. If MAOE# is active on SNPSTB#
sampling, the snoop request is ignored.



7.62 SNPSTB# 

Snoop Strobe 

Initiates 82495XP snoop and latches snoop address 
& attributes 

Input to 82495XP (pin R3) Snooping Signal 

Sync to CLK or SNPCLK, or strobed 

7.62.1 SIGNAL DESCRIPTION 

Snoop strobe initiates a 82495XP snoop request. It 
controls the latching of the snoop address and 
snoop attribute signals, in the manner specified by 
one of three snooping modes: 

Snooping Modes

Mode          Snoop Address/Attributes Sampled on:

Strobed       falling edge of SNPSTB#
Clocked       rising edge of SNPCLK when SNPSTB# sampled active
Synchronous   rising edge of CLK when SNPSTB# sampled active



SNPSTB# must be asserted to initiate a snoop
request. Snoops are initiated by a bus master for all
memory accesses, to ensure that data residing in
other caches is flushed if modified and invalidated if
necessary.

SNPSTB# must be deasserted for at least one
SNPCLK or CLK when clocked or synchronous
snooping mode (respectively) is used, in order to
rearm for the next snoop.

SNPSTB# can be asserted while a snoop is in
progress, allowing one level of pipelining. However, the
reassertion of SNPSTB# while snooping is in
progress must not occur until after SNPCYC#: precisely,
after the falling edge of SNPCYC# for strobed
and clocked modes, or in the clock after SNPCYC#
is active for synchronous mode. SNPSTB# must not
be asserted between the first and last BGT# of a
locked sequence. Similarly, SNPSTB# must not
occur after the BGT# of the write through and before
the BGT# of the allocation when a Read-for-Ownership
transaction is occurring.

SNPSTB# itself does not affect the cache contents
or states, but the snoop signals SNPINV and
SNPNCA, latched upon SNPSTB#, force various
changes in the cache on a snoop hit.

7.62.2 WHEN SAMPLED

SNPSTB# is sampled on every SNPCLK or CLK in 
clocked or synchronous modes, and is sampled con- 
stantly in strobed mode. While a snoop is in prog- 
ress, a new SNPSTB# is recognized as a new, pos- 
sibly pipelined, snoop request. After the assertion of 
a pipelined SNPSTB#, the SNPSTB# signal must 
not be reasserted until after the next SNPCYC#. 

SNPSTB# should always meet proper set-up and 
hold times when operating in clocked or synchro- 
nous modes. When operating in strobed mode, it 
must meet minimum active/inactive times to be 
properly recognized in the next clock. 

7.62.3 RELATION TO OTHER SIGNALS 

SNPSTB# latches the following signals: SNPINV, 
SNPNCA, MBAOE#, and MAOE#, and the address 
on the MSET, MTAG, and MCFA pins. The address 
which appears on the MSET, MTAG, and MCFA ad- 
dress pins is to be snooped in the 82495XP. 
MAOE# acts as a qualifier for a snoop; if MAOE# is 
active when sampled on a SNPSTB# assertion, the 
snoop request is ignored. SNPINV and SNPNCA 
provide the 82495XP with snoop attributes which af- 
fect the state of a snoop hit cache entry. 

If MBAOE# is active during SNPSTB# assertion,
the 82495XP forces all bits in the subline address
(those address bits which MBAOE# controls) to 0
on a snoop write back for that snoop.

Snoops and memory accesses are interlocked, such 
that after BGT# for a memory access has been is- 
sued, a SNPSTB# which is asserted will be latched, 
with its address and attributes, but will not cause a 
snoop until after SWEND# for that memory cycle. 
After BGT# has been issued for a cycle, snoop 
write-backs are delayed until after the CRDY# for 
that cycle. Likewise, once a snoop is underway 
(SNPCYC# active) BGT# is ignored until snoop 
completion. 

SNPSTB# must not be deasserted and reasserted 
(specifically, cause a second falling edge) between 
its initial recognition and SNPCYC#; i.e., SNPSTB#
must not be asserted before the SNPCYC# of the 
previous SNPSTB#. In strobed and clocked modes, 
SNPSTB# can be reasserted after the falling edge 
of SNPCYC#; in synchronous mode, SNPSTB# can 
be reasserted in the CLK after SNPCYC# is active. 
This second assertion of SNPSTB#, after 
SNPCYC#, can occur while the first snoop is still 
progressing (SNPBSY# is active), allowing one level 
of snoop pipelining. In this case, a third assertion of 
SNPSTB# must not occur until after the SNPCYC# 
for the second, piped snoop request. 
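
The reassertion rule above (a new SNPSTB# only after the SNPCYC# of the previous one) can be checked against a recorded event trace. This is a hedged sketch: the event labels "STB" and "CYC" and the function name are hypothetical, and the model ignores setup and hold timing, LOCK#, and Read-for-Ownership restrictions.

```python
def check_snoop_strobes(events):
    """Return the index of the first illegal SNPSTB#, or None if legal.

    events is a sequence of "STB" (SNPSTB# asserted) and
    "CYC" (SNPCYC# observed) in time order.
    """
    awaiting_cyc = False
    for index, event in enumerate(events):
        if event == "STB":
            if awaiting_cyc:
                return index         # previous strobe has no SNPCYC# yet
            awaiting_cyc = True
        elif event == "CYC":
            awaiting_cyc = False
        else:
            raise ValueError("unknown event: %r" % (event,))
    return None
```

A pipelined snoop appears in such a trace as "STB", "CYC", "STB": the second strobe is legal even if the first snoop is still busy, because its SNPCYC# has already occurred.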



SNPSTB# must not be asserted while the 82495XP 
is executing a locked sequence (LOCK# active). 
Specifically, SNPSTB# must not be asserted after 
the BGT# for the first locked access and before the 
BGT# of the last locked access. 

Systems which support Read-for-Ownership must 
not assert SNPSTB# between the BGT# of the 
write through and the BGT# of the allocation during 
a Read-for-Ownership operation. 



7.63 SWEND# 

Snoop Window End 

Closes Snooping Window 

Input to 82495XP (pin Q1) Cycle Progress Signal 

Synchronous to CLK 

7.63.1 SIGNAL DESCRIPTION 

SWEND# is an input to the 82495XP that, when 
asserted, closes the snooping window and causes 
sampling of MWB/WT# and DRCTM#. Once 
snooping of all other 82495XP's is complete, 
DRCTM# and MWB/WT# can be determined. 

Snoop response is blocked by the 82495XP
between BGT# and SWEND# activation. Therefore,
the sooner SWEND# is asserted, the sooner snoop
results can be determined.

All CPU-generated write cycles and cache read miss 
cycles must cause a snoop on the memory bus. 
SWEND# may be activated once snooping has 
completed for these cycles. SWEND# activation 
causes the 82495XP's internal tags to change state 
for the current cycle (if necessary). DRCTM# and 
MWB/WT# influence the state change decision. 

SWEND# need only be activated for those cycles 
which require the sampling of DRCTM# and 
MWB/WT#. 

If a cycle does not specifically require SWEND#, 
and SWEND# is not returned, snooping is blocked 
from BGT# to CRDY#. For this reason, it may be 
more efficient to always return SWEND#. 
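
The cost of not returning SWEND# can be illustrated by counting the clocks in which snoop response is blocked. This is a rough sketch: the function name is hypothetical, and treating the window as starting on the BGT# clock and ending just before the closing clock is an assumption of the model, not a timing specification.

```python
def snoop_blocked_clocks(bgt_clk, crdy_clk, swend_clk=None):
    """Return the clocks in which snoop response is blocked.

    The window closes at SWEND# if it is returned, otherwise at CRDY#.
    """
    end = crdy_clk if swend_clk is None else swend_clk
    if end < bgt_clk:
        raise ValueError("window cannot close before BGT#")
    return range(bgt_clk, end)
```

With BGT# in clock 3 and CRDY# in clock 10, returning SWEND# in clock 7 shortens the blocked window from seven clocks to four.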

7.63.2 WHEN SAMPLED 

SWEND# is sampled by the 82495XP on or after the
clock in which KWEND# is sampled active, for those
cycles that sample KWEND#. For cycles that do not
sample KWEND#, SWEND# is sampled with or after
BGT#. Once SWEND# is sampled active, it is
ignored until KWEND# of the next cycle. If SWEND#
is not being sampled, it may violate setup and hold
times.

Snoop response is blocked between BGT# and 
SWEND#. If a snoop is initiated between BGT# 
and SWEND#, the MTHIT# and MHITM# re- 
sponse is given after SWEND# activation. Any sub- 
sequent snoop write back would begin after 
CRDY#. 



7.63.3 RELATION TO OTHER SIGNALS 

SWEND# causes the sampling of MWB/WT# and 
DRCTM#. SWEND# is sampled once KWEND# is 
sampled active. BGT#, KWEND#, and SWEND# 
may be asserted in the same clock. 

SWEND# shares a pin with CFG1.



7.64 SYNC# 

Sync 

Synchronizes 82495XP TAG array with Main Memory

Input to 82495XP (pin Q4) Cache Synchronization Signal

Asynchronous 

7.64.1 SIGNAL DESCRIPTION 

SYNC# activation will cause the synchronization of 
the 82495XP and i860 XP CPU tag arrays with main 
memory. The 82495XP will flush all modified entries 
to memory. All valid tag entries will be kept, with 
modified [M] state lines becoming non-modified [E] 
state lines. 
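
The effect of a SYNC operation on the tag array can be sketched as follows. This is an illustrative model only: the dictionary representation of the tags and the function name are assumptions, and the sketch does not model the CPU tag array or bus activity.

```python
def sync_cache(tags):
    """Model the tag-state effect of SYNC#.

    tags maps a line address to its MESI state letter.
    Returns (new_tags, flushed), where flushed lists the addresses
    whose modified data would be written to memory.
    """
    flushed = sorted(addr for addr, state in tags.items() if state == "M")
    # All valid entries are kept; only M lines change, becoming E.
    new_tags = {addr: ("E" if state == "M" else state)
                for addr, state in tags.items()}
    return new_tags, flushed
```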



7.64.2 WHEN SAMPLED 

SYNC# can be asserted at any time. The 82495XP 
will complete all outstanding cycles on the CPU and 
memory bus before beginning the SYNC process. 
The memory bus controller does not have to prevent 
SYNC# during locked cycles because the 82495XP 
will complete its locked cycle before the SYNC pro- 
cess will begin. 

Once a SYNC operation has begun, the SYNC#
signal is ignored until the operation completes. If
RESET or FLUSH# is asserted while the SYNC
operation is in progress, the SYNC operation will be
aborted and the RESET or FLUSH immediately
executed.



SYNC# is an asynchronous input. SYNC# must 
have a pulse width of 2 CLK's in order to guarantee 
82495XP recognition. 

7.64.3 RELATION TO OTHER SIGNALS 

To initiate a SYNC, the 82495XP will complete all 
pending cycles and prohibit further ADS#s from occurring
while a SYNC is in progress. The FSIOUT# output 
signal is used to indicate the start and end of the 
SYNC operation. It will become active when the 
SYNC# signal is internally recognized (all outstand- 
ing cycles have completed) and will de-activate 
when the SYNC operation has completed. 

The memory bus controller supplies BRDY# to the 
CPU once the SYNC has completed. Once SYNC 
has begun, and FSIOUT# active, all CADS#'s and 
CRDY#'s correspond to the write-backs caused by 



The 82495XP can be snooped during SYNC cycles 
and the snooping protocols will be the same as that 
for any memory bus cycle. 



7.65 TCK 

Test Clock 

Clock for the JTAG boundary scan tests 

Input to the i860 XP CPU (pin Q1) Test Signal 

Input to the 82495XP (pin P3) 

Input to the 82490XP (pin 3) 

Synchronous 

7.65.1 SIGNAL DESCRIPTION 

TCK is an input to the i860 XP CPU, 82495XP and 
82490XP and provides the clocking function re- 
quired by the JTAG boundary scan feature. TCK is 
used to clock state information and data into and out 
of the component. State select information and data 
are clocked into the component on the rising edge 
of TCK on TMS and TDI, respectively. Data is 
clocked out of the part on the falling edge of TCK on 
TDO. 

In addition to using TCK as a free running clock, it 
may be stopped in a low, logic 0, state, indefinitely 
as described in IEEE 1149.1. While TCK is stopped 
in the low state, the boundary scan latches retain 
their state. 

When boundary scan is not used, TCK should be
tied low.

7.65.2 WHEN SAMPLED

TCK is a clock signal and is used as a reference for 
sampling other JTAG signals. 



7.65.3 RELATION TO OTHER SIGNALS 

On the rising edge of TCK, TMS and TDI are
sampled. On the falling edge of TCK, TDO is driven.
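
The two TCK edges can be illustrated with a simplified serial shift register. This is a sketch, not a model of the actual TAP: the register length, the list representation, and the function name are assumptions, and TAP controller states are not modeled.

```python
def shift_cycle(chain, tdi):
    """Simulate one TCK cycle of a serial shift register.

    Rising edge: TDI is sampled and shifted into the first stage.
    Falling edge: TDO is driven from the last stage.
    Returns (new_chain, tdo).
    """
    new_chain = [tdi] + chain[:-1]   # rising edge of TCK
    tdo = new_chain[-1]              # falling edge of TCK
    return new_chain, tdo
```

For a hypothetical three-stage register, a bit presented on TDI reaches TDO on the falling edge of the third TCK cycle after it was first sampled.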



7.66 TDI

Test Data Input

Receives serial test instructions and data

Input to the i860 XP CPU (pin S14) Test Signal

Input to the 82495XP (pin N3)

Input to the 82490XP (pin 2)

Synchronous to TCK

7.66.1 SIGNAL DESCRIPTION

TDI is the serial input used to shift JTAG instructions
and data into the component. The shifting of
instructions and data occurs during the SHIFT-IR and
SHIFT-DR TAP controller states, respectively.
These states are selected using the TMS signal as
described in chapter 9.

An internal pull up resistor is provided on TDI to
ensure a known logic state if an open circuit occurs on
the TDI path. Note that when "1" is continuously
shifted into the instruction register, the BYPASS
instruction is selected.

7.66.2 WHEN SAMPLED

TDI is sampled on the rising edge of TCK, during the
SHIFT-IR and the SHIFT-DR states. During all other
TAP controller states, TDI is a "don't care".

7.66.3 RELATION TO OTHER SIGNALS

TDI is only sampled when TMS and TCK have been
used to select the SHIFT-IR or SHIFT-DR states in
the TAP controller.

For proper initialization of JTAG logic, TDI should be
driven high, "1", for at least four TCK cycles
following the rising edge of RESET.



7.67 TDO

Test Data Output

Outputs serial test instructions and data

Output from the i860 XP CPU (pin R10) Test Signal

Output from the 82495XP (pin C4)

Output from the 82490XP (pin 84)

Synchronous to TCK

7.67.1 SIGNAL DESCRIPTION 

TDO is the serial output used to shift JTAG instruc- 
tions and data out of the component. The shifting of 
instructions and data occurs during the SHIFT-IR 
and SHIFT-DR TAP controller states, respectively.
These states are selected using the TMS signal as 
described in chapter 9. 

When not in SHIFT-IR or SHIFT-DR state, TDO is 
driven to a high impedance state to allow connecting 
TDO of different devices in parallel. 

7.67.2 WHEN DRIVEN

TDO is driven on the falling edge of TCK during the
SHIFT-IR and SHIFT-DR TAP controller states. At
all other times TDO is driven to the high impedance
state.



7.67.3 RELATION TO OTHER SIGNALS

TDO is only driven when TMS and TCK have been
used to select the SHIFT-IR or SHIFT-DR states in
the TAP controller.



7.68 TMS 

Test Mode Select 

Controls testing by selecting mode of operation 

Input to the i860 XP CPU Test Signal 

Input to the 82495XP (pin P2) 

Input to the 82490XP (pin 1) 

Synchronous to TCK 

7.68.1 SIGNAL DESCRIPTION 

TMS is decoded by the JTAG TAP (Test Access
Port) to select the operation of the test logic, as
described in chapter 9.

To guarantee deterministic behavior of the TAP
controller, TMS is provided with an internal pull-up
resistor. If boundary scan is not used, TMS may be tied
high or left unconnected.

7.68.2 WHEN SAMPLED 

TMS is sampled on every rising edge of TCK. 

7.68.3 RELATION TO OTHER SIGNALS 

TMS is used to select the internal TAP states
required to load boundary scan instructions or data on
TDI.

For proper initialization of the JTAG logic, TMS 
should be driven high, "1", for at least four TCK cy- 
cles following the rising edge of RESET. 

7.69 Vcc and Vss

Power and Ground Pins 

See Tables 1.1 and 1.2 for locations. 



7.70 WWOR# 

Weak Write Ordering Mode 
Enforces strong/weak write-ordering policy 
Input to 82495XP (pin Q2) Configuration Signal 
Synchronous to CLK 

7.70.1 SIGNAL DESCRIPTION 

When WWOR# is asserted during reset, the 82495XP enforces
a weak write ordering policy. If WWOR# is deassert- 
ed during reset, the 82495XP enforces a strong 
write-ordering policy. 

In a strong write-ordering mode, writes to the memo- 
ry bus are forced to occur in the order in which they 
were posted by the CPU. In a weak write-ordering 
mode it is possible for: 

1. A CPU posted write (A) to be waiting in a 
82495XP/82490XP memory buffer. 

2. A subsequent CPU write (B) to complete in the 
82495XP/82490XP because it was a hit to M or E 
state. 



3. A snoop hit to B to cause a write back of B before 
A is written. 

In this scenario, B is written to memory before A is, 
and thus CPU writes have been reordered. 
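
The three-step scenario above can be replayed in a few lines to show the resulting memory write order. This is a sketch only: the function name is hypothetical, and draining the posted write before the snoop write-back is an assumed way of modeling strong ordering, not a description of how the 82495XP enforces it.

```python
def memory_write_order(weak_ordering):
    """Return the order in which writes A and B reach memory."""
    posted = ["A"]        # step 1: write A waits in a memory buffer
    order = []
    # Step 2: write B completes in the cache (hit to M or E state),
    # so the line holding B is now modified.
    # Step 3: a snoop hits the modified line holding B.
    if not weak_ordering:
        order += posted   # assumed: posted write drains first
        posted = []
    order.append("B")     # snoop write-back of B
    order += posted       # any remaining posted write follows
    return order
```

Under these assumptions, weak ordering yields B before A, while strong ordering preserves the order in which the CPU posted the writes.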



7.70.2 WHEN SAMPLED 

WWOR# is sampled during reset as shown in Figure 7-1, with
a setup time of 4 CPU clocks. WWOR# becomes 
MALE once FSIOUT# indicates that the 82495XP 
reset sequence has completed. 



7.70.3 RELATION TO OTHER SIGNALS 

WWOR# shares a pin with MALE. 

8.0 BUS FUNCTIONAL DESCRIPTION 
AND TIMING

The 82495XP/82490XP cache core supports a wide 
variety of bus transfers to meet the needs of high 
performance systems. Bus transfers can be single 
cycle or multiple cycle, cacheable or non-cacheable, 
64- or 128-bit (memory bus), and locked. To support 
multiprocessing systems there are cache back-inval- 
idation, inquire, snooping, read for ownership, cache 
to cache transfers, and locked cycles. 

This section begins with read cycles, both cacheable 
and non-cacheable. It moves on to write cycles, 
cacheable and non-cacheable. Snooping cycles are 
discussed next with an example of each snooping 
mode. The remaining sections describe special cy- 
cles: read for ownership, I/O, and locked cycles. 

The cycles shown in this chapter are examples of
various types of 82495XP/82490XP cycles. These
examples are intended to show signal relationships
and are not necessarily best-case scenarios.



8.1 Read Cycles 

8.1.1 READ HITS 

Read Hit cycles are executed completely within the 
CPU/Cache core, and will not be seen by the MBC: 



[Timing diagram over clocks 1 to 21. Signals shown include CADS#,
ADDRESS, CW/R#, RDYSRC, CNA#, BGT#, KWEND#, MKEN#, and SWEND#.]

Figure 8-1. Cacheable Read Miss with Clean Replacement

8.1.2 CACHEABLE READ MISSES

8.1.2.1 Read Miss with Clean Replacement 

Figure 8-1 illustrates CPU-initiated read cycles that
miss the 82495XP/82490XP cache and replace a
non-dirty (e.g., clean or empty) line in the cache. In
such cycles, the 82495XP will instruct the MBC to
perform a cache line-fill cycle on the memory bus. A
cache line-fill is a read of a complete
82495XP/82490XP line from main memory. The line
is then written into the 82490XP's array, and data
is transferred to the CPU as requested. If the line
fetched from main memory replaces an
82495XP/82490XP cache line which is in a valid
unmodified state ([E] or [S]), then a back-invalidation
cycle is performed on the CPU bus to guarantee that
the replaced data is also removed from the CPU's
first level cache, thus maintaining the inclusion
property.

CACHE CONTROL SIGNALS: 

The CPU initiates the read cycle to the 
82495XP/82490XP cache where the cache tag 
state is looked up. Once the 82495XP determines 
the cycle to be a cache miss, it issues CADS# 
(clock 2) and the associated cycle control signals to 
the MBC (e.g., CW/R#, CM/IO#, CD/C#, RDYSRC,
MCACHE#) in order to schedule the cache line-fill 
operation. MCACHE# is active, indicating that the 
read miss is potentially cacheable by the 82495XP; 
RDYSRC is active, indicating that the MBC must 
supply BRDY#s to the CPU cache core. 

The memory bus address (MSET[10:0],
MTAG[11:0], MCFA[6:0]) is valid with CADS#
(clocks 2 and 13 for the two cycles in this example)
and remains valid until after CNA# is sampled active
by the 82495XP (clocks 5 and 16). MALE and
MBALE may be used to hold the address as necessary.

The MBC arbitrates for the memory bus and returns 
BGT# asserted (clock 3), indicating that the cycle is 
guaranteed to complete on the memory bus. Once 
the 82495XP samples BGT# asserted, it must finish 
that cycle on the memory bus. Prior to this point, the 
cycle can be aborted by a snoop hit in the cache. 

CNA# is asserted by the MBC (clock 4) to indicate 
that it is ready to schedule a new memory bus cycle. 
Note that after CNA# activation, cycle control sig- 
nals are not guaranteed to be valid. 

When the MBC has determined the cacheability
attribute of the cycle, it drives the MKEN# signal
accordingly. The MBC also drives the KWEND# signal
at this time, indicating the end of the cacheability
window. The 82495XP samples MKEN# during
KWEND# (clock 5) to determine that the cycle is
indeed cacheable.

The MBC asserts SWEND# when the snoop win- 
dow ends on the memory bus. The 82495XP sam- 
ples MWB/WT# and DRCTM# during SWEND# 
(clock 7) and updates the cache tag state according 
to the consistency protocol. The closure of the 
snoop window also enables the MBC to start provid- 
ing the CPU with data that has been stored in the 
82490XP's memory cycle buffer. The MBC supplies 
BRDY#s to the CPU (clocks 7-10). 

The first cycle ends when CRDY# is driven active 
by the MBC (clock 10). It is at this time that the data 
in the 82490XP's memory cycle buffers is loaded 
into the cache SRAM. 

The 82495XP issues a new CADS# in clock 13, 
which also misses the 82495XP/82490XP cache. 
Note that once the cycle progress signals (BGT#, 
CNA#, KWEND#, SWEND#) of a cycle are sam- 
pled asserted, the 82495XP ignores them until the 
CRDY# of that cycle. The 82495XP does not
pipeline the cycle progress signals (i.e., it will not sample
them again until after CRDY# of the current memory 
bus cycle). 
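
The clock numbers quoted above for the first read miss can be gathered into a small table and checked for a consistent progression. This is a sketch: the dictionary and function names are hypothetical, and the check is deliberately loose (non-decreasing clocks), since BGT#, KWEND#, and SWEND# may be asserted in the same clock. CNA# is listed for reference but not checked.

```python
# Clocks taken from the description of the first cycle in Figure 8-1.
FIRST_READ_MISS = {
    "CADS#": 2,    # 82495XP issues the cycle
    "BGT#": 3,     # MBC guarantees the cycle
    "CNA#": 4,     # MBC ready to schedule a new cycle
    "KWEND#": 5,   # cacheability window ends, MKEN# sampled
    "SWEND#": 7,   # snoop window ends
    "CRDY#": 10,   # cycle ends
}


def progress_order_ok(clocks):
    """Check CADS# <= BGT# <= KWEND# <= SWEND# <= CRDY#."""
    order = ["CADS#", "BGT#", "KWEND#", "SWEND#", "CRDY#"]
    times = [clocks[name] for name in order]
    return all(a <= b for a, b in zip(times, times[1:]))
```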

MEMORY BUS SIGNALS: 

The memory address latch enables (MALE and 
MBALE) may remain asserted by the MBC to place 
the address latches in flow through mode. If the 
82495XP is the current bus master, the memory ad- 
dress output enables (MAOE# and MBAOE#) 
should be asserted by the MBC. MDQE# must be 
inactive to allow the data pins to be used as inputs. 

Some time after the address has been driven onto 
the memory bus, data will be supplied from the 
DRAM (main memory) to the 82490XP cache 
SRAM. 

For Clocked Memory Bus Mode, MSEL# is driven 
active by the MBC (clock 4) to allow sampling of 
MBRDY# and to latch MZBT# for the transfer. 
MZBT# is sampled on all MCLK edges where 
MSEL# is inactive. Once MSEL# is sampled active 
by the 82495XP, the value of MZBT# sampled on 
the prior MCLK is used for the next transfer. 
MBRDY# is driven active by the MBC in clocks 4 to 
6 to cause the memory burst counter to be
incremented and data to be placed into the 82490XP
cache memory cycle buffers. The MBC drives
MEOC# asserted (clock 7) to end the current cycle 
on the memory bus and switch memory cycle buffers 
for the new cycle. MZBT# is latched at this time 
(when MEOC# is sampled asserted and MSEL# re- 
mains low) for the next transfer. 

MBRDY# is driven active by the MBC in clocks 15 
to 17 to read data into the 82490XP cache memory 
cycle buffers. The MBC asserts MEOC# (clock 18) 
to end the second read miss cycle on the memory 
bus and switch the memory cycle buffers for a new 
cycle. 

For Strobed Memory Bus Mode, MSEL# is driven 
active by the MBC (clock 4) to allow MISTB opera- 
tion and to latch MZBT# (on the falling edge of 
MSEL#) for the transfer. MISTB is toggled in clocks 
5 to 7 to cause the memory burst counter to be in- 
cremented, and data to be placed into the 82490XP 
cache memory cycle buffers. Note: MISTB latches 
the memory bus data on both the rising and falling 
edges. The MBC drives MEOC# asserted (clock 8) 
to end the current cycle on the memory bus and 
switch memory cycle buffers for the new cycle. 
MZBT# for the next cycle is sampled at this time on
the falling edge of MEOC#. 

MISTB is toggled by the MBC (clocks 15 to 17) to 
read data into the 82490XP memory cycle buffers. 
The MBC asserts MEOC# (clock 18) to end the sec- 
ond read miss cycle on the memory bus and switch 
the memory cycle buffers for a new cycle. 

8.1.2.2 Read Miss with Replacement of Dirty 
Line 

Figure 8-2 illustrates a CPU read cycle which misses
the 82495XP cache, and requires the replacement
of a modified line (e.g., tag replacement,
lines/sector = 1, line ratio = 1). In such cycles, the
82495XP will instruct the MBC to perform a cache 
line-fill on the memory bus, instruct the 82490XP to 
fill its write-back buffer with the contents of the array 
location corresponding to the line which must be re- 
placed, and perform a back invalidation to the CPU 
to maintain the first and second level cache consist- 
ency. Once the cache line-fill has completed, the 
82495XP/82490XP will write back the contents of 
the replaced line to main memory from the 82490XP 
write-back buffer. 

CACHE CONTROL SIGNALS: 

The CPU initiates the read cycle to the 
82495XP/82490XP cache where the cache tag 
state is looked up. Once the 82495XP determines 
the cycle to be a cache miss, it issues CADS# 
(clock 1) and the associated cycle control signals to
the MBC (e.g., CW/R#, CM/IO#, CD/C#, RDYSRC,
MCACHE#) in order to schedule the cache line-fill 
operation. MCACHE# is active, indicating that the 
read miss is potentially cacheable by the 82495XP; 
RDYSRC is active, indicating that the MBC must 
supply BRDY#s to the CPU cache core. 

The memory bus address (MSET[10:0],
MTAG[11:0], MCFA[6:0]) is valid with CADS#
(clocks 1 and 5 for the two cycles in this example)
and remains valid until after CNA# is sampled active
by the 82495XP (clocks 4 and 10). MALE and
MBALE may be used to hold the address as necessary.

The MBC arbitrates for the memory bus and returns 
BGT# asserted (clock 2), indicating that the cycle is 
guaranteed to complete on the memory bus. At this 
point, the 82490XP's write-back buffer is prefilled 
with the line to be replaced. Once the 82495XP sam- 
ples BGT# asserted, it must finish that cycle on the 
memory bus. Prior to this point, the cycle can be 
aborted by a snoop hit from another cache. 

CNA# is asserted by the MBC (clock 3) to indicate 
that it is ready to schedule a new memory bus cycle. 
Note that after CNA# activation, cycle control sig- 
nals are not guaranteed to be valid. 

When the MBC has determined the cacheability at- 
tribute of the cycle, it drives the MKEN# signal ac- 
cordingly. The MBC also drives the KWEND# signal 
at this time, indicating the end of the cacheability 
window. The 82495XP samples MKEN# during 
KWEND# (clock 4) to determine that the cycle is 
indeed cacheable. 

The MBC asserts SWEND# (clock 6) when the 
snoop window ends on the memory bus. The clo- 
sure of the snoop window enables the MBC to start 
providing the CPU with data that has been stored in 
the 82490XP's memory cycle buffer. The MBC sup- 
plies BRDY#s to the CPU (clocks 6-9) to serve the 
read cycle. Note that data may be supplied to the
82490XP immediately after MSEL# activation, and
need not wait for SWEND#.

On the memory bus, the 82495XP issues a write- 
back (WB) cycle. CNA# is sampled active in clock 3 
causing the 82495XP to issue the CADS# (also 
CDTS#) of the write-back (clock 5). The MBC 
knows this is a write back cycle and not a CPU initia- 
ted write cycle by sampling MCACHE# asserted. 
This tells the MBC how many data transfers are nec- 
essary. 

BGT#, CNA#, and KWEND# of the write-back are 
sampled asserted by the MBC (clock 9) after the 
CRDY# of the read miss cycle (clock 8). At this
point, the 82495XP may issue another CADS# for a
new (unrelated) memory bus cycle. It is at this time
that the data in the 82490XP's memory cycle buffers
is loaded into the cache SRAM. The data to be writ-
ten back to main memory is in the 82490XP's write
back buffers.

Figure 8-2. Cacheable Read Miss with Replacement of Dirty Line

The snoop window for the write back cycle is closed 
by the MBC in clock 11, and the cycle is ended by 
CRDY# sampled asserted in clock 13. 

MEMORY BUS SIGNALS: 

The memory address latch enables (MALE and 
MBALE) may remain asserted by the MBC to place 
the address latches in flow through mode. If the 
82495XP is the current bus master, the memory ad- 
dress output enables (MAOE# and MBAOE#) 
should be asserted by the MBC. 

Some time after the address has been driven onto 
the memory bus, data will be supplied from the 
DRAM (main memory) to the 82490XP cache 
SRAM. 

For Clocked Memory Bus Mode, MSEL# is driven 
active by the MBC (clock 3) to allow sampling of 
MBRDY# and to latch MZBT# for the transfer. 
MZBT# is sampled on all MCLK edges where 
MSEL# is inactive. Once MSEL# is sampled active 
by the 82495XP, the value of MZBT# sampled on 
the prior MCLK is used for the next transfer. 
MBRDY# is driven active by the MBC in clocks 3 to 
5 to cause the memory burst counter to be incre- 
mented and data to be placed into the 82490XP 
cache memory cycle buffers. The MBC drives 
MEOC# asserted (clock 6) to end the current cycle 
on the memory bus and switch memory cycle buffers 
for the new cycle. MZBT# is latched at this time 
(when MEOC# is sampled asserted) for the next 
transfer. 
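
The MZBT# sampling rule for Clocked Memory Bus Mode can be sketched in a few lines. This is an illustrative model only: the function name and the (msel_active, mzbt_level) pair representation are assumptions made for the example, not part of the device interface.

```python
from typing import Iterable, Optional, Tuple


def mzbt_used_for_transfer(edges: Iterable[Tuple[bool, bool]]) -> Optional[bool]:
    """Return the MZBT# level that applies to the next transfer.

    edges -- one (msel_active, mzbt_level) pair per MCLK edge, in time order.
             msel_active is True when MSEL# is asserted; mzbt_level is the
             logic level seen on the MZBT# pin (True = high).

    MZBT# is tracked on every edge where MSEL# is inactive. On the first
    edge where MSEL# is seen active, the value captured on the prior edge
    is the one used. Returns None if no prior sample exists.
    """
    last_sample: Optional[bool] = None
    for msel_active, mzbt_level in edges:
        if msel_active:
            return last_sample
        last_sample = mzbt_level
    return None
```

For instance, if MZBT# is high, then low, and MSEL# goes active on the third edge, the low value from the second edge is the one used, regardless of the MZBT# level on the third edge.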

The MBC asserts the memory data output enable 
signal (MDOE#, clock 8) to drive the memory data 
outputs. 

MBRDY# is driven active by the MBC in clocks 10 
to 12 to write data from the 82490XP cache memory 
cycle buffers onto the memory bus. The MBC as-
serts MEOC# (clock 13) to end the write back cycle
on the memory bus and switch the memory cycle 
buffers for a new cycle. 

For Strobed Memory Bus Mode, MSEL# is driven 
active by the MBC (clock 4) to allow MISTB opera- 
tion and to latch MZBT# for the transfer (on 
MSEL# falling edge). MISTB is toggled in clocks 5
to 7 to cause the memory burst counter to be incre-
mented, and data to be placed into the 82490XP
cache memory cycle buffers. Note: MISTB latches
the memory bus data on both the rising and falling 
edges. The MBC drives MEOC# asserted (clock 8) 
to end the current cycle on the memory bus and 
switch memory cycle buffers for the new cycle. 
MZBT# for the next cycle is latched at this time on
the falling edge of MEOC#.
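
Because MISTB latches data on both edges, every change of strobe level corresponds to one data transfer. The sketch below counts transfers from a strobe waveform. The list-of-levels representation and the function name are assumptions made for illustration.

```python
from typing import Sequence


def count_strobe_transfers(levels: Sequence[int]) -> int:
    """Count latching events on a strobe given as successive 0/1 levels.

    Each change of level, rising or falling, counts as one transfer.
    """
    return sum(1 for prev, cur in zip(levels, levels[1:]) if prev != cur)
```

A waveform such as 0, 1, 1, 0, 1 contains three level changes and therefore three transfers in this model.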

The MBC asserts MDOE# (clock 9) to drive the 
memory data outputs. 

MOSTB is toggled by the MBC (clocks 10 to 12) to 
write data from the 82490XP memory cycle buffers 
onto the memory bus. The MBC asserts MEOC# 
(clock 13) to end the write back cycle on the memo-
ry bus and switch the memory cycle buffers for a
new cycle. 

8.1.3 NON-CACHEABLE READ MISSES 

8.1.3.1 Read Misses not Cacheable by CPU/Cache Core and Cacheable by Core, but not by Memory Bus

Figure 8-3 illustrates two CPU read cycles which
miss the 82495XP cache, and are non-cacheable. In 
the first cycle, the CPU/Cache core forces the read 
to be non-cacheable (as indicated by the 
MCACHE# output from the 82495XP). In the sec- 
ond cycle, non-cacheability of the data is forced by 
the memory bus (as indicated by the MKEN# input 
from the MBC). Since neither cycle is cacheable,
no line-fill operation is performed; the cycles are
merely echoed to the memory bus.
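
The two sources of non-cacheability described above can be summarised as a small decision function. This is a minimal sketch: the function name and boolean arguments are illustrative, with True meaning that the signal is asserted.

```python
def read_miss_outcome(mcache_active: bool, mken_active_at_kwend: bool) -> str:
    """Classify a read miss as seen by the MBC.

    mcache_active        -- MCACHE# asserted with CADS# (core permits caching)
    mken_active_at_kwend -- MKEN# asserted when KWEND# is sampled
                            (memory bus permits caching)
    """
    if not mcache_active:
        # The core has already ruled out caching; MKEN# is not consulted.
        return "non-cacheable (forced by CPU/cache core)"
    if not mken_active_at_kwend:
        return "non-cacheable (forced by memory bus)"
    return "cacheable line fill"
```

The first cycle in this example corresponds to read_miss_outcome(False, ...), the second to read_miss_outcome(True, False).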

CACHE CONTROL SIGNALS: 

The CPU initiates the first read cycle to the 
82495XP/82490XP cache where the cache tag 
state is looked up. Once the 82495XP determines 
the cycle to be a cache miss, it issues a cycle re- 
quest (CADS# in clock 1) and the associated cycle 
control signals to the MBC (e.g. CW/R#, CM/IO#,
CD/C#, RDYSRC, MCACHE#) in order to schedule 
the read operation. RDYSRC is active, indicating 
that the MBC must provide BRDY# to the CPU; 
MCACHE# is not active, indicating that the read
miss is not cacheable by the CPU/Cache core.

The memory bus address (MSET[10:0],
MTAG[11:0], MCFA[6:0]) is valid with CADS#
(clocks 1 and 5 for the two cycles in this example)
and remains valid until after CNA# is sampled active
by the 82495XP (clocks 4 and 10). MALE and MBA- 
LE may be used to hold the address as necessary. 

Figure 8-3. Non-Cacheable Read Misses

The MBC arbitrates for the memory bus and returns
BGT# asserted (clock 2), indicating that the cycle is 
guaranteed to complete on the memory bus. Once 
the 82495XP samples BGT# asserted, it must finish 
that cycle on the memory bus. Prior to this point, the 
cycle can be aborted by a snoop hit from another 
cache. 

CNA# is asserted by the MBC (clock 3) to indicate
that it is ready to schedule a new memory bus cycle. 
Note that after CNA# activation, cycle control sig- 
nals are not guaranteed to be valid. 

This cycle has already been determined to be non-
cacheable; therefore, the MBC does not need to
assert SWEND#, KWEND#, or MKEN# to the 
82495XP/82490XP cache. The MBC supplies 
BRDY# to the CPU to complete the cycle to the 
CPU. The MBC asserts CRDY# (clock 8) to the
82495XP/82490XP to complete the read miss cycle 
on the memory bus. 

The 82495XP issues a new (unrelated) cycle request 
(CADS# in clock 5) which also misses the 
82495XP/82490XP cache. Since the 82495XP has 
already sampled CNA# asserted, it issues a new 
CADS# prior to receiving CRDY# of the current cy-
cle (i.e. this cycle is pipelined within the MBC). Note
that once the cycle progress signals of a cycle are 
sampled asserted, the 82495XP ignores them until 
the CRDY# of that cycle. The 82495XP will not 
sample the cycle progress signals again until after 
the CRDY# of the current memory bus cycle. The 
current read cycle is completed on the bus in clock 8 
with CRDY# assertion.
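
The rule that cycle progress signals are accepted once per cycle and then ignored until CRDY# can be modelled as a small tracker. This is a sketch under stated assumptions: the class name, the use of signal-name strings, and the per-clock sample() call are all illustrative and do not describe the device's internal logic.

```python
class CycleProgressTracker:
    """Accepts each cycle progress signal once per memory bus cycle."""

    SIGNALS = ("BGT#", "CNA#", "KWEND#", "SWEND#")

    def __init__(self) -> None:
        self._seen = set()

    def sample(self, asserted) -> list:
        """Return the signals newly accepted on this clock.

        asserted -- collection of signal names driven active on this clock.
        Signals already accepted for the current cycle are ignored.
        """
        accepted = [s for s in self.SIGNALS
                    if s in asserted and s not in self._seen]
        self._seen.update(accepted)
        return accepted

    def crdy(self) -> None:
        """CRDY# ends the cycle; sampling starts afresh for the next one."""
        self._seen.clear()
```

A BGT# that stays asserted for several clocks is therefore accepted only on the first of them, and is accepted again only after crdy() has been called.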

The cycle control signals for the second read miss
are also valid at this time (clock 5). RDYSRC is ac-
tive, indicating that the MBC must provide BRDY#s
to the CPU/Cache core; and MCACHE# is active, 
indicating that the read miss is potentially cacheable 
by the 82495XP/82490XP. 

The MBC issues BGT# and CNA# to the 82495XP 
in clock 9 to indicate that the cycle is guaranteed to 
complete on the memory bus, and that it is ready to 
schedule a new memory bus cycle. KWEND# is as- 
serted at this time to close the cacheability window. 
MKEN# is not active, indicating to the 82495XP that 
the read miss cycle is not cacheable by the memory 
bus. KWEND# and MKEN# must be returned to the 
82495XP at least two clocks prior to BRDY# to in- 
form the CPU that a line fill will not follow. 
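
The two-clock requirement between KWEND#/MKEN# and BRDY# lends itself to a simple design check. The sketch below is illustrative only; the function and parameter names are assumptions, and the clock numbers in the usage note are hypothetical rather than taken from Figure 8-3.

```python
def kwend_leads_brdy(kwend_clock: int, first_brdy_clock: int,
                     min_lead: int = 2) -> bool:
    """True if KWEND#/MKEN# are returned at least min_lead clocks
    before the first BRDY# of the cycle."""
    return first_brdy_clock - kwend_clock >= min_lead
```

With KWEND# in clock 9, a first BRDY# in a hypothetical clock 11 satisfies the check, while a BRDY# in clock 10 does not.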

The MBC asserts SWEND# (clock 11) to close the 
snoop window, and CRDY# (clock 13) to complete
the cycle to the 82495XP/82490XP. Note:
SWEND# is not needed since the cycle was not 
cacheable. 

NOTE: 

Both examples show single cycle read requests. 

MEMORY BUS SIGNALS: 

The memory address latch enables (MALE and 
MBALE) may remain asserted by the MBC to place 
the address latches in flow through mode. If the 
82495XP is the current bus master, the memory ad- 
dress output enables (MAOE# and MBAOE#) 
should be asserted by the MBC. The memory data 
output enable (MDOE#) must be inactive to allow 
the data pins to be used as inputs. 

Some time after the address has been driven onto 
the memory bus, data will be supplied from the 
DRAM (main memory) to the 82490XP memory cy- 
cle buffers. 

For Clocked Memory Bus Mode, MEOC# is assert- 
ed by the MBC (clock 6) to latch MZBT# for the 
next transfer, and end the current cycle on the mem- 
ory bus (MBRDY# and MSEL# are not necessary 
since this example shows a single transfer cycle). 
MZBT# is driven high by the MBC in order to force 
the read cycle to begin with a non-zero burst ad- 
dress. 

For the second non-cacheable read cycle, MSEL# 
is driven active by the MBC (clock 8) to allow sam- 
pling of MBRDY# and to latch MZBT# for the trans- 
fer. MZBT# is sampled on all MCLK edges where 
MSEL# is inactive. Once MSEL# is sampled active 
by the 82495XP, the value of MZBT# sampled on 
the prior MCLK is used for the next transfer. Again, 
MZBT# is driven high by the MBC to force the trans- 
fer to begin with the correct burst address. 
MBRDY# is driven active by the MBC in clock 10 to 
cause the memory burst counter to be incremented 
and data to be placed into the 82490XP cache mem- 
ory cycle buffers. The MBC drives MEOC# asserted 
(clock 12) to end the current cycle on the memory 
bus and switch memory cycle buffers for the new 
cycle. 

For Strobed Memory Bus Mode, MEOC# is driven 
active by the MBC (clock 5) to latch MZBT# for the 
transfer (on MEOC# falling edge), and end the cur- 
rent cycle on the memory bus (MISTB is not neces- 
sary since this example shows a single transfer cy- 
cle). MZBT# is driven high by the MBC in order to 
force the read cycle to begin with the correct burst 
address. 

Figure 8-4. Write Hit to [S] State Line (Write-Through)

For the second non-cacheable read cycle, MSEL# 
is driven active by the MBC (clock 8) to allow MISTB 
operation and to latch MZBT# for the transfer (on 
MSEL# falling edge). Again, MZBT# is driven high 
by the MBC to force the transfer to begin with the 
correct burst address. MISTB is toggled in clock 9 to 
cause the memory burst counter to be incremented, 
and data to be placed into the 82490XP cache mem- 
ory cycle buffers. Note: MISTB latches the memory 
bus data on both the rising and falling edges. The 
MBC drives MEOC# asserted (clock 13) to end the 
current cycle on the memory bus and switch memo- 
ry cycle buffers for the new cycle. MZBT# for the 
next cycle (not shown) is sampled at this time on
the falling edge of MEOC#. 



8.2 Write Cycles 

8.2.1 WRITE HITS 

8.2.1.1 Write Hit to [E] or [M] States 

CPU initiated write cycles which hit 82495XP entries 
tagged in the [E] or [M] states are executed com- 
pletely within the CPU/Cache core, and will not be 
seen by the MBC. 

8.2.1.2 Write Hit to [S] State 

Figure 8-4 illustrates CPU initiated write cycles which
hit lines in the 82495XP/82490XP cache array that 
are in the shared state. If the 82495XP/82490XP is 
used as a write through cache (not write back), the 
[S] state is the only state a cached line could be in. 
These cycles are posted as are all normal write cy- 
cles (as long as no other write miss is pending). 
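
The write hit behaviour described in 8.2.1.1 and 8.2.1.2 reduces to a lookup on the state of the line that was hit. This is a minimal sketch; the function name and the single-letter state strings are assumptions made for the example.

```python
def write_hit_seen_by_mbc(line_state: str) -> bool:
    """Return True if a CPU write hit to a line in this state reaches the MBC.

    "E" and "M" hits complete inside the CPU/cache core.
    "S" hits are posted and written through to the memory bus.
    """
    if line_state in ("E", "M"):
        return False
    if line_state == "S":
        return True
    raise ValueError(f"not a valid state for a write hit: {line_state!r}")
```

In a write-through configuration, where [S] is the only state a cached line can be in, every write hit therefore produces a memory bus cycle.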

CACHE CONTROL SIGNALS: 

The CPU initiates the write cycle to the 
82495XP/82490XP cache where the cache tag 
state is looked up. Once the 82495XP determines 
the cycle to be a hit to shared state, it posts the write 
and returns BRDY# to the CPU. 

The 82495XP next issues a cycle request (CADS# 
in clock 1), and the associated cycle control signals 
to the MBC (e.g. CW/R#, CM/IO#, CD/C#,
RDYSRC, MCACHE#, PALLC#) in order to sched- 
ule the write through operation. MCACHE# is not 
active since the write will be posted; RDYSRC is not 
active, indicating that the 82495XP will supply 
BRDY# to the CPU; PALLC# is not active, indicat-
ing that an allocation cycle will not be performed
(regardless of MKEN# state) since the line is al-
ready available in the cache. The MBC must also
latch PWT and PCD on BLE# falling edge in order 
to track hits and misses to the [S] state. This is how 
an external state tracker can track the [S] state. 

The memory bus address (MSET[10:0], 
MTAG[11:0], MCFA[6:0]) is valid with CADS# 
(clocks 1 and 6 for the two cycles in this example) 
and remains valid until after CNA# is sampled active 
by the 82495XP (clocks 4 and 9). MALE and MBALE 
may be used to hold the address as necessary. 

The MBC arbitrates for the memory bus and returns 
BGT# asserted (clock 2), indicating that the cycle is 
guaranteed to complete on the memory bus. Once 
the 82495XP samples BGT# asserted, it must finish 
that cycle on the memory bus. Prior to this point, the 
cycle can be aborted by a snoop hit from another 
cache. 

CNA# is asserted by the MBC (clock 3) to indicate 
that it is ready to schedule a new memory bus cycle. 
Note that after CNA# activation, cycle control sig- 
nals are not guaranteed to be valid. KWEND# is 
also driven at this time since the cacheability of this 
cycle is already known and MKEN# is a don't care. 
It is not necessary that KWEND# be asserted at this 
time. 

The 82495XP provides BRDY# to the CPU since 
the cycles are posted writes. The MBC completes 
the first write hit to [S] state in clock 5 when it as- 
serts CRDY# to the 82495XP/82490XP cache. The 
data is latched into the 82490XP array from the
memory cycle buffer at this time. 

In this example, the 82495XP issues a second write 
to [S] state in clock 6. For this cycle, the 82495XP 
issues the memory bus request (CADS#) as soon 
as it can after sampling CNA# asserted. The 
82495XP will not wait for KWEND# (if it does not 
get asserted immediately as in this example) to is-
sue CADS# since this is not a potential allocate cy-
cle (i.e. PALLC# is not active).

The MBC asserts BGT#, CNA#, and KWEND# to- 
gether in clock 8 to indicate that the current cycle is 
guaranteed to complete and the 82495XP is free to 
schedule a new memory bus cycle. 

Again, the 82495XP provides BRDY# to the CPU 
since the cycles are posted writes. The MBC com- 
pletes the second write hit to [S] state in clock 12 
when it asserts CRDY# to the 82495XP/82490XP 
cache. The data is latched into the 82490XP array
from the memory cycle buffer at this time. 

MEMORY BUS SIGNALS:

The memory address latch enables (MALE and 
MBALE) may remain asserted by the MBC to place 
the address latches in flow through mode. If the 
82495XP is the current bus master, the memory ad- 
dress output enables (MAOE# and MBAOE#) 
should be asserted by the MBC. 

For Clocked Memory Bus Mode, the memory data 
output enable signal (MDOE#) is asserted by the 
MBC in clock 2 to drive the memory data outputs. 

MEOC# is asserted by the MBC (clock 4) to latch 
MZBT# for the transfer, and end the current cycle 
on the memory bus (MBRDY# is not necessary 
since this example shows a single transfer cycle). 
MZBT# is driven high by the MBC in order to force 
the write cycle to begin with the correct burst ad-
dress. MFRZ# is sampled here (it need not be ac-
tive since the cycle is not potentially allocatable).

For the second write through cycle, MSEL# is driv- 
en active by the MBC (clock 7) to allow sampling of 
MBRDY# and to latch MZBT# for the transfer. 
MZBT# is sampled on all MCLK edges where 
MSEL# is inactive. Once MSEL# is sampled active 
by the 82495XP, the value of MZBT# sampled on 
the prior MCLK is used for the next transfer. Again, 
MZBT# is driven high by the MBC to force the trans- 
fer to begin with the correct burst address. 
MBRDY# is driven active by the MBC in clock 10 to 
cause the memory burst counter to be incremented 
and data to be placed into the 82490XP cache mem- 
ory cycle buffers. The MBC drives MEOC# asserted 
(clock 12) to end the current cycle on the memory 
bus and switch memory cycle buffers for the new 
cycle. 

For Strobed Memory Bus Mode, the memory data 
output enable (MDOE#) is asserted by the MBC in 
clock 2 to drive the memory data outputs. 

MEOC# is driven active by the MBC (clock 4) to 
latch MZBT# for the transfer (on MEOC# falling 
edge), and end the current cycle on the memory bus 
(MOSTB is not necessary since this example shows 
a single transfer cycle). MZBT# is driven high by the 
MBC in order to force the write cycle to begin with
the correct burst address. 

For the second write through cycle, MSEL# is driv- 
en active by the MBC (clock 6) to allow MOSTB op- 
eration and to latch MZBT# for the transfer (on 
MSEL# falling edge). Again, MZBT# is driven high 
by the MBC to force the transfer to begin with the
correct burst address. MOSTB is toggled in clock 9
to cause the memory burst counter to be increment- 
ed, and data to be placed into the 82490XP cache 
memory cycle buffers. Note: MOSTB latches the 
memory bus data on both the rising and falling edg- 
es. The MBC drives MEOC# asserted (clock 11) to 
end the current cycle on the memory bus and switch 
memory cycle buffers for the new cycle. MZBT# for 
the next cycle (not shown) is sampled at this time
on the falling edge of MEOC#. 

8.2.2 WRITE MISSES 

8.2.2.1 Write Miss with no Allocation 

Figure 8-5 illustrates two CPU initiated write cycles
which miss the 82495XP/82490XP cache and are
not allocatable. The first write cycle begins as a po-
tentially allocatable cycle, but MKEN# sampled in-
active indicates that the cycle is not cacheable by
the memory bus. The second write miss cycle is not 
cacheable by the CPU/82495XP/82490XP as indi- 
cated by the PALLC# output from the 82495XP. 

CACHE CONTROL SIGNALS: 

The CPU initiates the first write cycle to the 
82495XP/82490XP cache where the cache tag 
state is looked up. Once the 82495XP determines
the cycle to be a cache miss, it issues a cycle re-
quest (CADS# in clock 1) and the associated cycle
control signals to the MBC (e.g. CW/R#, CM/IO#,
CD/C#, RDYSRC, MCACHE#, PALLC#) in order 
to schedule the write miss operation. RDYSRC is not 
active, indicating that the 82495XP will supply 
BRDY# to the CPU; MCACHE# is not active; 
PALLC# is active, indicating that the cycle is poten- 
tially allocatable. 

The write miss data is posted in the 82490XP's 
memory cycle buffer, and the cycle completes with 
no wait states to the CPU. The CPU is then free to 
issue another (non-related) cycle while the 82495XP 
completes the current write miss cycle and possible 
allocation. If this new cycle is a cache hit, it will be 
serviced by the 82495XP immediately; but if it is a 
cache miss, its service will wait until the CRDY# of 
the write cycle (and allocation cycle, if executed). 
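
The handling of a new CPU cycle while a posted write miss is outstanding can be sketched as follows. The function and argument names are hypothetical and the returned strings are labels for the example only.

```python
def next_cpu_cycle_action(is_cache_hit: bool,
                          outstanding_crdy_received: bool) -> str:
    """Decide how a new CPU cycle is handled while a write miss is posted.

    is_cache_hit              -- the new cycle hits the cache
    outstanding_crdy_received -- CRDY# has been received for the posted
                                 write (and for the allocation, if one
                                 was executed)
    """
    if is_cache_hit:
        return "service immediately"
    if outstanding_crdy_received:
        return "service the miss"
    return "wait for CRDY#"
```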

The memory bus address (MSET[10:0], 
MTAG[11:0], MCFA[6:0]) is valid with CADS# 
(clocks 1 and 7 for the two cycles in this example) 
and remains valid until after CNA# is sampled active
by the 82495XP (clocks 4 and 10). MALE and MBA- 
LE may be used to hold the address as necessary. 

Figure 8-5. Write Miss with No Allocation

The MBC arbitrates for the memory bus and returns
BGT# asserted (clock 2), indicating that the write 
through cycle is guaranteed to complete on the 
memory bus. Once the 82495XP samples BGT# as- 
serted, it must finish that cycle on the memory bus. 
Prior to this point, the cycle can be aborted by a 
snoop hit from another cache. 

CNA# is asserted by the MBC (clock 3) to indicate 
that it is ready to schedule a new memory bus cycle. 
Note that the cycle control signals are not guaran-
teed to be valid after CNA# activation. Note also that
CNA# has no effect before KWEND#.
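
The CNA#/KWEND# relationship noted here, together with the earlier write hit example in which CADS# was issued without waiting for KWEND#, suggests the gating rule below. It is a simplified sketch covering only the write cases walked through in this section. The function name and arguments are assumptions, and the model covers only early (pipelined) issue, not the normal issue of a new cycle after CRDY#.

```python
def may_pipeline_next_cads(cna_sampled: bool, kwend_sampled: bool,
                           potential_allocate: bool) -> bool:
    """True if the next CADS# may be issued before CRDY# of the current cycle.

    cna_sampled        -- CNA# has been sampled active for the current cycle
    kwend_sampled      -- KWEND# has been sampled active for the current cycle
    potential_allocate -- PALLC# was active for the current cycle
    """
    if not cna_sampled:
        return False
    # For a potentially allocatable cycle, CNA# has no effect until KWEND#.
    return kwend_sampled or not potential_allocate
```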

When the MBC has determined the cacheability at- 
tribute of the write through cycle, it drives the 
MKEN# signal accordingly. The MBC also drives 
the KWEND# signal at this time (clock 4), indicating 
the end of the cacheability window. The 82495XP 
samples MKEN# inactive during KWEND#, indicat-
ing that the missed line should not be allocated.

The MBC asserts SWEND# (clock 6) when the 
snoop window of the write through cycle ends on the 
memory bus. The MBC may return CRDY# to the 
82495XP/82490XP cache any time after the closure 
of the snoop window. In this example, CRDY# is 
issued by the MBC in clock 8. 

The 82495XP issues a cycle request for the second 
write miss cycle in clock 7. The cycle control signals 
are valid at this time. Note that PALLC# is inactive, 
indicating that the 82495XP/82490XP has deter- 
mined the cycle to not be allocatable. 

The MBC asserts BGT#, CNA#, and KWEND# in
clock 9. MKEN# is a don't care during the cachea-
bility window since the cycle is not allocatable. The
snoop window is closed in clock 11, and the cycle is
completed on the memory bus in clock 13 with the 
assertion of CRDY# by the MBC. 

MEMORY BUS SIGNALS: 

The memory address latch enables (MALE and 
MBALE) may remain asserted by the MBC to place 
the address latches in flow through mode. If the 
82495XP is the current bus master, the memory ad- 
dress output enables (MAOE# and MBAOE#) 
should be asserted by the MBC. 

For Clocked Memory Bus Mode, the memory data 
output enable (MDOE#) is asserted by the MBC in 
clock 4 to drive the memory data outputs. 



MEOC# is asserted by the MBC (clock 5) to latch 
MZBT# for the transfer, and end the current cycle 
on the memory bus (MBRDY# is not necessary 
since this example shows a single transfer cycle). 
MZBT# is driven high by the MBC in order to force 
the write cycle to begin with the correct burst ad-
dress. MFRZ# is sampled here (it need not be ac-
tive since the cycle is not potentially allocatable).

For the second non-allocatable write cycle, MSEL#
is driven active by the MBC (clock 8) to allow sam- 
pling of MBRDY# and to latch MZBT# for the trans- 
fer. MZBT# is sampled on all MCLK edges where 
MSEL# is inactive. Once MSEL# is sampled active 
by the 82495XP, the value of MZBT# sampled on 
the prior MCLK is used for the next transfer. Again, 
MZBT# is driven high by the MBC to force the trans- 
fer to begin with the correct burst address. 
MBRDY# is driven active by the MBC in clock 10 to 
cause the memory burst counter to be incremented 
and data to be placed into the 82490XP cache mem-
ory cycle buffers.

The MBC drives MEOC# asserted (clock 13) to end 
the current cycle on the memory bus and switch 
memory cycle buffers for the new cycle. MFRZ# is 
sampled here (it need not be active since the cycle 
is not potentially allocatable). MZBT# is also sam- 
pled at this time. 

For Strobed Memory Bus Mode, the memory data 
output enable (MDOE#) is asserted by the MBC in 
clock 2 to drive the memory data outputs. 

MEOC# is driven active by the MBC (clock 5) to 
latch MZBT# for the transfer, and end the current 
cycle on the memory bus (MOSTB is not necessary 
since this example shows a single transfer cycle). 
MZBT# is driven high by the MBC in order to force 
the write cycle to begin with the correct burst ad-
dress.

For the second write through cycle, MSEL# is driv- 
en active by the MBC (clock 8) to allow MOSTB op- 
eration and to latch MZBT# for the transfer. Again, 
MZBT# is driven high by the MBC to force the trans- 
fer to begin with the correct burst address. MOSTB 
is toggled in clock 12 to cause the memory burst 
counter to be incremented, and data to be read from 
the 82490XP cache memory cycle buffers. Note: 
MOSTB latches the memory bus data on both the 
rising and falling edges. 

The MBC drives MEOC# asserted (clock 13) to end 
the current cycle on the memory bus and switch 
memory cycle buffers for the new cycle. MZBT#
and MFRZ# for the next cycle (not shown) are sam-
pled at this time on the falling edge of MEOC#.



[Timing diagram. Cache control signals: CLK, CADS#, ADDRESS, CDTS#, CW/R#,
RDYSRC, MCACHE#, PALLC#, BGT#, CNA#, KWEND#, MKEN#, SWEND#, CRDY#.
Clocked Memory Bus Mode: MCLK, MSEL#, MEOC#, MBRDY#, MZBT#, MFRZ#,
MDOE#, MDATA. Strobed Memory Bus Mode: MSEL#, MEOC#, MxSTB, MZBT#,
MFRZ#, MDOE#, MDATA.]

Figure 8-6. Write Miss with Allocation to [M] Line

8.2.2.2 Write Miss with Allocation

Figure 8-6 illustrates a CPU initiated write cycle
which misses the 82495XP/82490XP cache and fol- 
lows the write to main memory with an allocation 
cycle. An allocation occurs when the cache follows a
write miss cycle with a line fill. This example as-
sumes that allocating the new line requires the re-
placement of a modified line (i.e. a write-back to main
memory). 

CACHE CONTROL SIGNALS: 

The CPU initiates the write cycle to the 
82495XP/82490XP cache where the cache tag 
state is looked up. Once the 82495XP determines 
the cycle to be a cache miss, it issues CADS# 
(clock 1) and the associated cycle control signals to
the MBC (e.g. CW/R#, CM/IO#, CD/C#, RDYSRC,
MCACHE#, PALLC#) in order to schedule the write
operation. MCACHE# is not active; RDYSRC is not
active, indicating that the 82495XP will supply 
BRDY#s to the CPU; PALLC# is asserted, indicat- 
ing a potential allocate cycle after the write-through 
cycle. 

The write miss data is posted in the 82490XP's 
memory cycle buffer, and the cycle completes with 
no wait states to the CPU. The CPU is free to issue 
another (non-related) cycle while waiting for the 
82495XP to complete the allocation. If this new cy- 
cle is a cache hit, it will be serviced by the 82495XP 
immediately; but if it is a cache miss, its service will 
wait until the CRDY# of the allocation. 

The memory bus address (MSET[10:0], 
MTAG[11:0], MCFA[6:0]) is valid with CADS# 
(clocks 1, 5 and 10 for the three cycles in this exam-
ple) and remains valid until after CNA# is sampled
active by the 82495XP (clocks 4, 10 and 15). MALE
and MBALE may be used to hold the address as
necessary. 

The MBC arbitrates for the memory bus and returns 
BGT# asserted (clock 2), indicating that the write 
through cycle is guaranteed to complete on the 
memory bus. Once the 82495XP samples BGT# as- 
serted, it must finish that cycle on the memory bus. 
Prior to this point, the cycle can be aborted by a 
snoop hit from another cache. 

CNA# is asserted by the MBC (clock 3) to indicate 
that it is ready to schedule a new memory bus cycle. 
Note that after CNA# activation, cycle control sig-
nals are not guaranteed to be valid.

When the MBC has determined the cacheability at-
tribute of the write through cycle, it drives the
MKEN# signal accordingly. The MBC also drives
the KWEND# signal at this time, indicating the end 
of the cacheability window. The 82495XP samples 
MKEN# active during KWEND# (clock 4), indicat- 
ing that the missed line should be allocated in the 
cache. 
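
The allocation decision for a write miss, as it appears in Figures 8-5 and 8-6, can be expressed as a single predicate. This is a sketch only: the function name is hypothetical and True means the signal is asserted. Other inputs that may affect the outcome, such as MFRZ#, are not modelled.

```python
def write_miss_allocates(pallc_active: bool,
                         mken_active_at_kwend: bool) -> bool:
    """True if a write miss is followed by an allocation (line fill).

    pallc_active         -- PALLC# asserted with CADS# (potentially allocatable)
    mken_active_at_kwend -- MKEN# asserted when KWEND# is sampled
    """
    # MKEN# is a don't care when PALLC# is inactive.
    return pallc_active and mken_active_at_kwend
```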

At the first available time (clock 5), the 82495XP as- 
serts CADS# to request an allocation cycle. The cy- 
cle control signals are valid at this point: MCACHE# 
is active, indicating the cacheability of the line-fill cy- 
cle; RDYSRC is not active, indicating that the MBC 
need not supply BRDY#s to the CPU (no BRDY#s 
are necessary for an allocation cycle). 
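
The combinations of cycle control signals used in the examples of this chapter can be collected into a small decoder. This is an illustrative sketch covering only the memory cycles walked through here: the function name and labels are assumptions, True means asserted, and other cycle types are not represented.

```python
def classify_cycle(is_write: bool, mcache_active: bool,
                   rdysrc_active: bool, pallc_active: bool = False) -> str:
    """Label a memory bus cycle from the control signals valid with CADS#."""
    if is_write:
        if mcache_active:
            # A write with MCACHE# asserted is a write-back, not a CPU write.
            return "write-back"
        if pallc_active:
            return "potentially allocatable write miss"
        return "non-allocatable write"
    if mcache_active:
        # RDYSRC tells the MBC whether it must supply BRDY#s to the CPU.
        if rdysrc_active:
            return "potentially cacheable read miss"
        return "allocation line fill"
    return "non-cacheable read"
```

The allocation cycle requested in clock 5 of this example corresponds to classify_cycle(False, True, False).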

The MBC asserts SWEND# (clock 6) when the 
snoop window of the write through cycle ends on the 
memory bus. 

The MBC may return CRDY# to the 82495XP/
82490XP cache any time after the closure of the
snoop window. In this example, CRDY# is issued by 
the MBC in clock 8. At this time, the cycle progress 
signals for the allocation cycle may be issued by the 
MBC to complete the line fill. 

Once again, the MBC arbitrates for the memory bus 
and returns BGT# asserted (clock 9) for the alloca- 
tion cycle. The MBC also asserts CNA# and 
KWEND# at this time. The 82495XP back-invali- 
dates the CPU to maintain first and second level 
cache consistency. 

In clock 10, the 82495XP asserts CADS# for the 
write back cycle (since the miss was to a dirty line). 
CDTS# is asserted by the 82495XP two clocks later 
(clock 12). Note that CDTS# of the write back cycle 
is not asserted with CADS# since the data is not yet 
available in the 82490XP's write-back buffer. 

The MBC asserts SWEND# (clock 11) when the 
snoop window of the allocation cycle ends on the
memory bus. 

At this time, the MBC may assert CRDY# to the 
82495XP/82490XP cache for the allocation cycle. 
CRDY# assertion will cause the data stored in the 
82490XP's memory cycle buffers to be latched into 
the cache array. 

On the memory bus, BGT#, CNA#, and KWEND# 
are sampled active in clock 14 for the write back
cycle. The snoop window is closed two clocks later
(clock 16) by the MBC with SWEND#, and the write 
back cycle is completed with CRDY# asserted in 
clock 18. 
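
The clock numbers quoted in the walkthrough above can be gathered into one table to make the overlap between the three cycles easier to see. The values below are transcribed from the prose, not from the figure, and mix "asserted" and "sampled" clocks exactly as the text quotes them. The CRDY# clock of the allocation cycle is not stated and is therefore omitted. The data layout and helper names are assumptions made for this sketch.

```python
# Clock numbers as quoted in the text for the Figure 8-6 example.
EVENTS = {
    "write-through": {"CADS#": 1, "BGT#": 2, "CNA#": 3,
                      "KWEND#": 4, "SWEND#": 6, "CRDY#": 8},
    "allocation":    {"CADS#": 5, "BGT#": 9, "CNA#": 9,
                      "KWEND#": 9, "SWEND#": 11},
    "write-back":    {"CADS#": 10, "CDTS#": 12, "BGT#": 14, "CNA#": 14,
                      "KWEND#": 14, "SWEND#": 16, "CRDY#": 18},
}


def timeline(events):
    """Return (clock, cycle, signal) rows sorted by clock."""
    rows = [(clock, cycle, signal)
            for cycle, signals in events.items()
            for signal, clock in signals.items()]
    return sorted(rows)


def issued_before_completion(events, later, earlier):
    """True if the CADS# of `later` precedes the CRDY# of `earlier`."""
    return events[later]["CADS#"] < events[earlier]["CRDY#"]
```

For example, issued_before_completion(EVENTS, "allocation", "write-through") compares clock 5 with clock 8 and shows that the allocation request is pipelined behind the write-through cycle.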

MEMORY BUS SIGNALS:

The memory address latch enables (MALE and 
MBALE) may remain asserted by the MBC to place 
the address latches in flow through mode. If the 
82495XP is the current bus master, the memory ad- 
dress output enables (MAOE# and MBAOE#) 
should be asserted by the MBC. 

For Clocked Memory Bus Mode, the memory data 
output enable (MDOE#) has been asserted by the 
MBC to drive the memory data outputs. 

MEOC# is asserted by the MBC (clock 4) to latch 
MZBT# for the transfer, and end the current cycle 
on the memory bus (MBRDY# is not necessary 
since this example shows a single transfer write 
miss cycle). MZBT# is driven high by the MBC in 
order to force the write cycle to begin with the cor-
rect burst address. MFRZ# is driven inactive by the
MBC here, allowing the line to be placed into the 
exclusive ([E]) state and requiring the data to be 
written to main memory. 

For the allocation (line fill) cycle, MSEL# is driven 
active by the MBC (clock 6) to allow sampling of 
MBRDY# and to latch MZBT# for the transfer. 
MZBT# is sampled on all MCLK edges where 
MSEL# is inactive. Once MSEL# is sampled active 
by the 82495XP, the value of MZBT# sampled on 
the prior MCLK is used for the next transfer. 
MDOE# is also deasserted in clock 6 to allow the 
data pins to be used as inputs for the allocation cy- 
cle. 

MBRDY# is driven active by the MBC in clocks 7 to 
9 to cause the memory burst counter to be incre- 
mented and data to be placed into the 82490XP 
cache memory cycle buffers. The MBC drives 
MEOC# asserted (clock 10) to end the allocation 
cycle on the memory bus and switch memory cycle 
buffers for the new cycle. MZBT# is sampled and 
latched at this time for the next data transfer. 

MDOE# is asserted by the MBC (clock 12) to drive 
the memory data outputs for the write back cycle. 

The MBC again asserts MBRDY# (clocks 13 to 15) 
for the write back cycle to increment the memory 
burst counter and cause data to be read from the 
82490XP memory cycle buffers. The write back cy- 
cle ends on the memory bus and switches memory 
cycle buffers with MEOC# assertion (clock 16). 
MZBT# and MFRZ# for the next transfer are sam- 
pled at this time. MFRZ# need not be active since 
the cycle is not potentially allocatable. 

For Strobed Memory Bus Mode, the memory data 
output enable (MDOE#) has been asserted by the 
MBC to drive the memory data outputs for the write 
miss cycle. 



MEOC# is driven active by the MBC (clock 4) to 
latch MZBT# for the transfer, and end the current 
cycle on the memory bus (MOSTB is not necessary 
since this example shows a single transfer cycle). 
MZBT# is driven high by the MBC in order to force 
the read cycle to begin with the correct burst ad- 
dress. MFRZ# is driven deasserted by the MBC 
here, allowing the line to be placed into the exclu- 
sive ([E]) state. 

For the allocation (line fill) cycle, MSEL# is driven 
active by the MBC (clock 6) to allow MISTB opera- 
tion and to latch MZBT# for the transfer. MISTB is 
toggled in clocks 8 to 10 to cause the memory burst 
counter to be incremented, and data to be placed 
into the 82490XP cache memory cycle buffers. 
Note: MISTB latches the memory bus data on both 
the rising and falling edges. MDOE# is also deas- 
serted in clock 6 to allow the data pins to be used as 
inputs for the allocation cycle. 
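
Because MISTB latches data on both its rising and falling edges, the number of transfers equals the number of strobe level changes, not the number of pulses. The following Python sketch counts them. The function name and the list-of-levels waveform are hypothetical, chosen only for illustration.

```python
def count_strobe_transfers(levels):
    """Count data transfers for a strobe that latches on both edges.

    levels: successive logic levels (0 or 1) of the strobe.
    Every change of level, rising or falling, counts as one transfer.
    """
    return sum(1 for a, b in zip(levels, levels[1:]) if a != b)
```

A waveform that goes low, high, low, high, low therefore represents four transfers.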

The MBC drives MEOC# asserted (clock 11) to end 
the allocation cycle on the memory bus and switch 
memory cycle buffers for the new cycle. MZBT# for 
the next cycle is latched at this time on the falling 
edge of MEOC#. 

MDOE# is asserted by the MBC (clock 14) to drive 
the memory data outputs for the write back cycle. 

The MBC toggles MOSTB (clocks 15 to 17) for the 
write back cycle to increment the memory burst 
counter and cause data to be read from the 
82490XP memory cycle buffers. 

The write back cycle ends on the memory bus and 
switches memory cycle buffers with MEOC# asser- 
tion (clock 18). MZBT# and MFRZ# for the next 
transfer are sampled at this time. MFRZ# need not 
be active since the cycle is not potentially allocata- 
ble. 



8.3 Snooping Cycles 

8.3.1 SYNCHRONOUS SNOOPING MODE 
(HIT TO [M] LINE) 

Figure 8.7 illustrates a snoop hit to a dirty line 
sequence occurring simultaneously with a CPU initiated 
read miss cycle. This example assumes synchronous 
snooping mode (i.e. requests for snoops are 
done via SNPSTB# from the MBC, sampled on the 
82495XP's CLK). 
CACHE CONTROL SIGNALS: 



MEMORY BUS SIGNALS: 



In clock 1 SNPSTB# is asserted by the MBC, indi- 
cating to the 82495XP a request for snooping. The 
82495XP samples MAOE# (it must be inactive) in 
order to recognize the snoop request. It is latched 
together with the snoop address (MSET[0:10], 
MTAG[0:11], MCFA[0:6]), SNPINV, MBAOE#, and 
SNPNCA on the 82495XP's CLK during SNPSTB# 
assertion. The tag look-up is done immediately after 
SNPSTB# is sampled active since snoop opera- 
tions have the highest priority in the cache tag state 
arbiter. The 82495XP issues SNPCYC# (clock 2), 
indicating that the snoop look-up is in progress. The 
results of the look-up are driven to the memory bus 
via MTHIT# and MHITM# in the next clock after 
SNPCYC#. Since the snoop hit a modified line, both 
signals are asserted (clock 3). SNPBSY# is also is- 
sued to indicate that the 82495XP is busy with CPU 
back-invalidations, the 82490XP's snoop buffer is 
full, or a write back is to follow. The 82495XP will 
accept snoops only when SNPBSY# is inactive. 
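
The snoop look-up result driven on MTHIT# and MHITM# can be summarized as a small Python sketch. The text confirms two cases: a hit to a modified line asserts both signals, and a miss leaves both inactive. The result for a hit to an unmodified ([E] or [S]) line is an assumption made here for completeness, and the function name is hypothetical.

```python
def snoop_response(state):
    """Return (mthit_asserted, mhitm_asserted) for a snooped line.

    state: 'M', 'E', 'S' or 'I' (tag state of the snooped line).
    """
    if state == "M":
        return (True, True)    # hit to modified line: both asserted
    if state in ("E", "S"):
        return (True, False)   # ASSUMPTION: tag hit, line not modified
    if state == "I":
        return (False, False)  # snoop miss: both inactive
    raise ValueError("unknown tag state: %r" % (state,))
```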

Simultaneously with the memory bus activity due to 
the snoop request, the CPU initiates a read miss cy- 
cle. The 82495XP issues a memory bus request 
(CADS#), CDTS#, and cycle control signals to the 
MBC in clock 3. The MBC must wait for the pending 
snoop cycle to complete on the memory bus prior to 
servicing this read miss cycle. 

The memory bus address (MSET[10:0], 
MTAG[11:0], MCFA[6:0]) is not valid until MAOE# 
goes active after CRDY# of the snoop write back 
cycle is sampled active by the 82495XP and the 
CADS# is reissued (clock 13). 

In clock 4 the 82495XP issues SNPADS# and cycle 
control signals to the MBC, indicating a request to 
flush a modified line out of the cache. SNPADS# 
activation causes the MBC to abort the pending read 
miss cycle. It is the 82495XP's responsibility to 
re-issue the aborted cycle after the completion of the 
write back, since BGT# was not asserted by the 
MBC. 
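
The abort and re-issue rule can be modeled as a small state holder in Python. This is a minimal sketch under stated assumptions: the class and method names are hypothetical, and only the BGT# commit point, the abort on SNPADS#, and the re-issue after CRDY# of the write back are taken from the text.

```python
class PendingMemoryCycle:
    """Hypothetical model of one cycle requested with CADS#."""

    def __init__(self):
        self.committed = False  # True once BGT# has been sampled asserted
        self.aborted = False    # True once a snoop write back preempts it

    def sample_bgt(self):
        # BGT# commits the cycle only if it has not already been aborted.
        if not self.aborted:
            self.committed = True

    def snoop_hit_modified(self):
        # SNPADS# aborts the cycle only before the BGT# commit point.
        if not self.committed:
            self.aborted = True
        return self.aborted

    def write_back_crdy(self):
        # After CRDY# of the snoop write back, an aborted cycle is re-issued.
        reissue = self.aborted
        self.aborted = False
        return reissue
```

In the sequence described here, BGT# has not been returned when SNPADS# is issued, so the model marks the read miss as aborted and reports that it must be re-issued once the write back completes.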

Data is loaded into the 82490XP's snoop buffer. 
Since SNPINV was sampled asserted by the 
82495XP (clock 1) during SNPSTB# assertion, it 
back-invalidated the CPU's first level cache. 

The 82495XP asserts CDTS# (clock 8) indicating to 
the MBC that data is available in the snoop buffer. 
When the MBC completes the write back cycle on the 
memory bus, it activates CRDY# to the 
82495XP/82490XP cache. At this time, the 
82495XP deasserts SNPBSY# (clock 13) and re-issues 
the aborted read miss cycle (clock 13) by asserting 
CADS# and CDTS#. 



For Clocked Memory Bus Mode, the memory data 
output enable (MDOE#) is not activated by the MBC 
to allow the memory data pins to be used as inputs. 

MSEL# is driven active by the MBC (clock 4) to al- 
low sampling of MBRDY# and to latch MZBT# for 
the read miss transfer. MZBT# is sampled on all 
MCLK rising edges where MSEL# is inactive. Once 
MSEL# is sampled active by the 82495XP, the val- 
ue of MZBT# sampled on the prior MCLK is used 
for the next transfer. 

Since the read miss cycle is aborted due to the 
snoop hit to a modified line (requires a write back 
cycle), no MEOC# is given. MSEL# is deasserted 
by the MBC (clock 6) and reasserted (clock 8) to 
allow latching of MZBT# for the snoop write back 
cycle and sampling of MBRDY# for that cycle. 
MFRZ# is also sampled at this time. 

The memory data output enable (MDOE#) signal is 
driven active by the MBC (clock 7) to drive the mem- 
ory data outputs. 

MBRDY# is driven active by the MBC in clocks 10 
to 12 to cause the memory burst counter to be incremented 
and data to be written from the 82490XP 
cache snoop buffers. The MBC drives MEOC# asserted 
(clock 13) to end the write back cycle on the 
memory bus and switch memory cycle buffers for 
the new cycle. MZBT# and MFRZ# are sampled 
and latched at this time for the next data transfer. 

MDOE# is deasserted by the MBC (clock 14) to al- 
low the memory data pins to be used as inputs for 
the reissued read cycle. 

For Strobed Memory Bus Mode, the memory data 
output enable (MDOE#) has not been asserted by 
the MBC to allow the memory data pins to be used 
as inputs for the read miss cycle. 

MSEL# is asserted by the MBC (clock 4) to allow 
sampling of MISTB and latch MZBT# (on the falling 
edge of MSEL#) for the read miss transfer. 

Since the read miss cycle is aborted due to the 
snoop hit to a modified line (requires a write back 
cycle), no MEOC# is given. MSEL# is deasserted 
by the MBC (clock 5) and reasserted (clock 6) to 
allow latching of MZBT# for the snoop write back 
cycle and sampling of MOSTB for that cycle. 
MFRZ# is also sampled at this time. 

MOSTB is toggled in clocks 11 to 13 to cause the 
memory burst counter to be incremented, and data 
to be read from the 82490XP cache memory cycle 
buffers. Note: MOSTB latches the memory bus data 
on both the rising and falling edges. The MBC drives 
MEOC# asserted (clock 14) to end the snoop write 
back cycle on the memory bus and switch memory 
cycle buffers for the new cycle. MZBT# and 
MFRZ# for the next cycle are latched at this time 
on the falling edge of MEOC#. 

MDOE# is deasserted by the MBC (clock 14) to al- 
low the memory data pins to be used as inputs for 
the reissued read miss cycle. 

8.3.2 CLOCKED SNOOPING MODE 

Figure 8.8 illustrates a CPU initiated Read cycle 
which misses the 82495XP/82490XP cache and whose 
subsequent line fill replaces non dirty data (e.g. clean 
or empty). Simultaneous with the read request to the 
MBC, that device initiates a snoop to the 82495XP 
which misses that line in the cache. The snoop is the 
result of a write cycle on the memory bus by some 
other cache core; the MBC therefore asserts the snoop 
invalidation signal (SNPINV) to this 82495XP. This 
example assumes Clocked Snooping Mode (i.e. the 
requests for snoops are done via SNPSTB# from the 
MBC, sampled on the MBC's SNPCLK). 

CACHE CONTROL SIGNALS: 

The CPU initiates the read cycle to the 
82495XP/82490XP cache where the cache tag 
state is looked up. Once the 82495XP determines 
the cycle to be a cache miss, it issues CADS# 
(clock 1) and the associated cycle control signals to 
the MBC (e.g. CW/R#, CM/IO#, CD/C#, RDYSRC, 
MCACHE#) in order to schedule the cache line-fill 
operation. MCACHE# is active, indicating that the 
read miss is potentially cacheable by the 82495XP; 
RDYSRC is active, indicating that the MBC must 
supply BRDY#s to the CPU cache core. 

In clock 3, SNPSTB# is asserted by the MBC, 
indicating to the 82495XP a request for snooping. 
MAOE# is deasserted to allow the forthcoming 
snoop (the 82495XP will not recognize the snoop if 
MAOE# is active). It is latched together with the 
snoop address (MSET[0:10], MTAG[0:11], 
MCFA[0:6]), SNPINV, MBAOE#, and SNPNCA on 
the MBC's SNPCLK rising edge during SNPSTB# 
assertion. SNPINV is asserted by the MBC since 
the cache core which initiated the snoop issued a 
write cycle on the memory bus. If the response of 
the snoop to this 82495XP were a cache hit, the 
contents would no longer be valid due to that write. 



Following synchronization to the 82495XP CLK, it 
issues SNPCYC# (clock 5), indicating that the 
snoop look-up is in progress. The results of the look- 
up are driven to the memory bus via MTHIT# and 
MHITM# in the next clock after SNPCYC#. Since 
the snoop was a miss in the cache, both signals are 
inactive (clock 6). Note that SNPBSY# will not be 
asserted since the snoop was a miss to this cache. 
The snoop from another cache is complete at this 
point, and the read miss cycle will continue. 

The MBC asserts MAOE# to allow this 82495XP to 
drive its address on the memory bus in order to com- 
plete the read miss cycle. The memory bus address 
(MSET[10:0], MTAG[11:0], MCFA[6:0]) is valid after 
MAOE# assertion (clock 6 for the read cycle in 
this example) and remains valid until after CNA# is 
sampled active by the 82495XP (clock 8). MALE and 
MBALE may be used to hold the address as neces- 
sary. 

The MBC arbitrates for the memory bus and returns 
BGT# asserted (clock 6), indicating that the cycle is 
guaranteed to complete on the memory bus. Once 
the 82495XP samples BGT# asserted, it must finish 
that cycle on the memory bus. Prior to this point, the 
cycle can be aborted by a snoop hit from another 
cache. 

CNA# is asserted by the MBC (clock 7) to indicate 
that it is ready to schedule a new memory bus cycle. 
Note that after CNA# activation, cycle control sig- 
nals are not guaranteed to be valid. 

When the MBC has determined the cacheability at- 
tribute of the cycle, it drives the MKEN# signal ac- 
cordingly. The MBC also drives the KWEND# signal 
at this time, indicating the end of the cacheability 
window. The 82495XP samples MKEN# during 
KWEND# (clock 7) to determine that the cycle is 
indeed cacheable. 
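
The cacheability window can be sketched as a clock-by-clock scan in Python: MKEN# only matters in the clock where KWEND# is sampled asserted. The dictionary keys and the function name below are hypothetical, and a level of 0 stands for an asserted (low) signal.

```python
def sample_cacheability(clocks):
    """Find the end of the cacheability window and the sampled attribute.

    clocks: list of dicts with keys 'kwend_n' and 'mken_n', one per CLK,
            where 0 means the active-low signal is asserted.
    Returns (clock_number, cacheable) with clock_number counted from 1,
    or None if KWEND# is never asserted.
    """
    for number, levels in enumerate(clocks, start=1):
        if levels["kwend_n"] == 0:
            # MKEN# is sampled only while KWEND# is asserted.
            return number, levels["mken_n"] == 0
    return None
```

MKEN# levels seen before KWEND# are ignored by this model, which matches the idea that the MBC is free to resolve cacheability at any point before it closes the window.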

The MBC asserts SWEND# when the snoop win- 
dow ends on the memory bus. The 82495XP sam- 
ples MWB/WT# during SWEND# (clock 9) and up- 
dates the cache tag state according to the consist- 
ency protocol. The closure of the snoop window also 
enables the MBC to start providing the CPU with 
data that has been stored in the 82490XP's memory 
cycle buffer. The MBC supplies BRDY#s to the CPU 
(clocks 9-12). 

The read miss cycle ends when CRDY# is driven 
active by the MBC (clock 12). It is at this time that 
the data in the 82490XP's memory cycle buffers is 
loaded into the cache SRAM. 






82495XP Cache Controller/82490XP Cache RAM 






Figure 8-8. Clocked Snooping Mode 
(Timing diagram, clocks 1 to 13. Cache control signals: CLK, CADS#, 
ADDRESS, CW/R#, RDYSRC, MCACHE#, BGT#, CNA#, KWEND#, MKEN#, 
SWEND#, BRDY#, CRDY#, SNPCYC#. Clocked snooping signals: SNPCLK, 
SNPSTB#, SNPINV, MAOE#. Clocked Memory Bus Mode: MCLK, MSEL#, 
MEOC#, MBRDY#, MZBT#, MDATA. Strobed Memory Bus Mode: MSEL#, 
MEOC#, MxSTB, MZBT#.) 
MEMORY BUS SIGNALS: 



CACHE CONTROL SIGNALS: 



The memory address latch enables (MALE and 
MBALE) may remain asserted by the MBC to place 
the address latches in flow through mode. If the 
82495XP is the current bus master, the memory ad- 
dress output enables (MAOE# and MBAOE#) 
should be asserted by the MBC. (Note the use of 
MAOE# for snooping at the beginning of the cache 
control signals section.) MDOE# must be inactive to 
allow the data pins to be used as inputs. 

Some time after the address has been driven onto 
the memory bus, data will be supplied from the 
DRAM (main memory) to the 82490XP cache 
SRAM. 

For Clocked Memory Bus Mode, MSEL# is driven 
active by the MBC (clock 6) to allow sampling of 
MBRDY# and to latch MZBT# for the transfer. 
MZBT# is sampled on all MCLK edges where 
MSEL# is inactive. Once MSEL# is sampled active 
by the 82495XP, the value of MZBT# sampled on 
the prior MCLK is used for the next transfer. 
MBRDY# is driven active by the MBC in clocks 7 to 
9 to cause the memory burst counter to be incre- 
mented and data to be placed into the 82490XP 
cache memory cycle buffers. The MBC drives 
MEOC# asserted (clock 10) to end the current cycle 
on the memory bus and switch memory cycle buffers 
for the new cycle. MZBT# is sampled at this time 
(when MEOC# is sampled asserted and MSEL# re- 
mains low) for the next transfer. 

For Strobed Memory Bus Mode, MSEL# is driven 
active by the MBC (clock 6) to allow MISTB opera- 
tion and to latch MZBT# (on the falling edge of 
MSEL#) for the transfer. MISTB is toggled in clocks 
8 to 10 to cause the memory burst counter to be 
incremented, and data to be placed into the 
82490XP cache memory cycle buffers. Note: MISTB 
latches the memory bus data on both the rising and 
falling edges. The MBC drives MEOC# asserted 
(clock 11) to end the current cycle on the memory 
bus and switch memory cycle buffers for the new 
cycle. MZBT# for the next cycle is sampled at this 
time on the falling edge of MEOC#. 

8.3.3 STROBED SNOOPING MODE 
(HIT TO [M] LINE) 

Figure 8.9 illustrates a snoop hit to a dirty line 
sequence occurring simultaneously with a CPU initiated 
read miss cycle. This example assumes strobed 
snooping mode (i.e. requests for snoops are done 
from the falling edge of SNPSTB#). 



In clock 1 (totally asynchronous to any clock) 
SNPSTB# is asserted by the MBC, indicating to the 
82495XP a request for snooping. The 82495XP 
samples MAOE# (it must be inactive) in order to 
recognize the snoop request. It is latched together 
with the snoop address (MSET[0:10], MTAG[0:11], 
MCFA[0:6]), SNPINV, MBAOE#, and SNPNCA on 
the falling edge of SNPSTB#. The 82495XP issues 
SNPCYC# (clock 3), indicating that the snoop look- 
up is in progress. The results of the look-up are driv- 
en to the memory bus via MTHIT# and MHITM# in 
the next clock after SNPCYC#. Since the snoop hit 
a modified line, both signals are asserted (clock 4). 
SNPBSY# is also issued to indicate that the 
82495XP is busy with CPU back-invalidations, the 
82490XP's snoop buffer is full, or a write back is to 
follow. The 82495XP will accept snoops only when 
SNPBSY# is inactive. 

Simultaneously with the memory bus activity due to 
the snoop request, the CPU initiates a read miss cy- 
cle. The 82495XP issues a memory bus request 
(CADS#), CDTS#, and cycle control signals to the 
MBC in clock 1. The MBC must wait for the pending 
snoop cycle to complete on the memory bus prior to 
servicing this read miss cycle. 

The memory bus address (MSET[10:0], 
MTAG[11:0], MCFA[6:0]) is not valid until MAOE# 
goes active after CRDY# of the snoop write back 
cycle is sampled active by the 82495XP and the 
CADS# is reissued (clock 15). 

In clock 5 the 82495XP issues SNPADS# and cycle 
control signals to the MBC, indicating a request to 
flush a modified line out of the cache. SNPADS# 
activation causes the MBC to abort the pending read 
miss cycle. It is the 82495XP's responsibility to 
re-issue the aborted cycle after the completion of the 
write back, since BGT# was not asserted by the 
MBC. 

Data is loaded into the 82490XP's snoop buffer. 
Since SNPINV was sampled asserted by the 
82495XP (clock 1) during SNPSTB# assertion, it 
back-invalidated the CPU's first level cache. 

The 82495XP asserts CDTS# (clock 9) indicating to 
the MBC that data is available in the snoop buffer. 
When the MBC completes the write back cycle on the 
memory bus, it activates CRDY# to the 
82495XP/82490XP cache. At this time, the 
82495XP deasserts SNPBSY# (clock 15) and re-is- 
sues the aborted read miss cycle by asserting 
CADS# and CDTS#. 
MEMORY BUS SIGNALS: 

For Clocked Memory Bus Mode, the memory data 
output enable (MDOE#) is not activated by the MBC 
to allow the memory data pins to be used as inputs. 

MSEL# is driven active by the MBC (clock 2) to al- 
low sampling of MBRDY# and to latch MZBT# for 
the read miss transfer. MZBT# is sampled on all 
MCLK rising edges where MSEL# is inactive. Once 
MSEL# is sampled active by the 82495XP, the val- 
ue of MZBT# sampled on the prior MCLK is used 
for the next transfer. 

Since the read miss cycle is aborted due to the 
snoop hit to a modified line (requires a write back 
cycle), no MEOC# is given. MSEL# is deasserted 
by the MBC (clock 9) and reasserted (clock 11) to 
allow latching of MZBT# for the snoop write back 
cycle and sampling of MBRDY# for that cycle. 
MFRZ# is also sampled at this time. 

The memory data output enable (MDOE#) signal is 
driven active by the MBC (clock 9) to drive the mem- 
ory data outputs. 

MBRDY# is driven active by the MBC in clocks 11 
to 13 to cause the memory burst counter to be incre- 
mented and data to be written from the 82490XP 
cache memory cycle buffers. The MBC drives 
MEOC# asserted (clock 14) to end the write back 
cycle on the memory bus and switch memory cycle 
buffers for the new cycle. MZBT# and MFRZ# are 
sampled and latched at this time for the next data 
transfer. 

MDOE# is deasserted by the MBC (clock 16) to al- 
low the memory data pins to be used as inputs for 
the reissued read cycle. 

For Strobed Memory Bus Mode, the memory data 
output enable (MDOE#) has not been asserted by 
the MBC to allow the memory data pins to be used 
as inputs for the read miss cycle. 

MSEL# is asserted by the MBC (clock 2) to allow 
sampling of MISTB and latch MZBT# (on the falling 
edge of MSEL#) for the read miss transfer. 

Since the read miss cycle is aborted due to the 
snoop hit to a modified line (requires a write back 
cycle), no MEOC# is given. MSEL# is deasserted 
by the MBC (clock 9) and reasserted (clock 11) to 
allow latching of MZBT# for the snoop write back 
cycle and sampling of MOSTB for that cycle. 
MFRZ# is also sampled at this time. 

MOSTB is toggled in clocks 12 to 14 to cause the 
memory burst counter to be incremented, and data 
to be read from the 82490XP cache memory cycle 
buffers. Note: MOSTB latches the memory bus data 
on both the rising and falling edges. 

The MBC drives MEOC# asserted (clock 15) to end 
the snoop write back cycle on the memory bus and 
switch memory cycle buffers for the new cycle. 
MZBT# and MFRZ# for the next cycle are sampled 
at this time on the falling edge of MEOC#. 

MDOE# is deasserted by the MBC (clock 16) to al- 
low the memory data pins to be used as inputs for 
the reissued read miss cycle. 

8.3.4 CACHE TO CACHE TRANSFER 

8.3.4.1 Read Cycles Causing a Snoop Hit 
to [M] Line 

Figure 8.10 illustrates CPU initiated Read cycles that 
miss the 82495XP/82490XP cache and replace a 
non-dirty (e.g. clean) line in the cache. During the 
snoop window, the memory bus attribute which indi- 
cates a direct to [M] state transfer is sampled active. 
In such cycles, the 82495XP will instruct the MBC to 
perform a cache line-fill cycle on the memory bus. 
The request for data will not go to main memory, but 
instead will go to the controller of the cache which 
contained the modified data. The line is then written 
into the 82490XP's array, and data transferred to the 
CPU as requested. If the line fetched from the sec- 
ond cache replaces a line which is in valid unmodi- 
fied state ([E] or [S]), then a back-invalidation cycle 
is performed on the CPU bus to guarantee that the 
replaced data is also removed from the CPU's first 
level cache, thus maintaining the inclusion property. 

CACHE CONTROL SIGNALS: 

The CPU initiates the read cycle to the 
82495XP/82490XP cache where the cache tag 
state is looked up. Once the 82495XP determines 
the cycle to be a cache miss, it issues CADS# 
(clock 2) and the associated cycle control signals to 
the MBC (e.g. CW/R#, CM/IO#, CD/C#, RDYSRC, 
MCACHE#) in order to schedule the cache line-fill 
operation. MCACHE# is active, indicating that the 
read miss is potentially cacheable by the 82495XP; 
RDYSRC is active, indicating that the MBC must 
supply BRDY#s to the CPU cache core. 

The memory bus address (MSET[10:0], 
MTAG[11:0], MCFA[6:0]) is valid with CADS# 
(clocks 2 and 13 for the two read miss cycles in this 
example) and remains valid until after CNA# is sampled 
active by the 82495XP (clocks 5 and 16). MALE 
and MBALE may be used to hold the address as 
necessary. 
Figure 8-10. Cache to Cache Transfer: Cacheable Read Miss 






The MBC arbitrates for the memory bus and returns 
BGT# asserted (clock 3), indicating that the cycle is 
guaranteed to complete on the memory bus. Once 
the 82495XP samples BGT# asserted, it must finish 
that cycle on the memory bus. Prior to this point, the 
cycle can be aborted by a snoop hit from another 
cache. 

CNA# is asserted by the MBC (clock 4) to indicate 
that it is ready to schedule a new memory bus cycle. 
Note that after CNA# activation, cycle control sig- 
nals are not guaranteed to be valid. 

When the MBC has determined the cacheability at- 
tribute of the cycle, it drives the MKEN# signal ac- 
cordingly. The MBC also drives the KWEND# signal 
at this time, indicating the end of the cacheability 
window. The 82495XP samples MKEN# and 
MRO# during KWEND# (clock 5) to determine that 
the cycle is indeed cacheable. 

The MBC asserts SWEND# when the snoop win- 
dow ends on the memory bus. The 82495XP sam- 
ples MWB/WT# and DRCTM# during SWEND# 
(clock 7) and updates the cache tag state according 
to the consistency protocol. Since the result of the 
snoop was a hit to a modified line in another cache, 
the MBC asserts DRCTM# at this time (this is an 
option to save time by skipping the main memory 
access, not a requirement of the memory bus) so 
that the tag state will go immediately to the [M] 
state, skipping the [E] state. MWB/WT# must be in 
write back mode (high) to assure this transition. The 
closure of the snoop window also enables the MBC 
to start providing the CPU with data that has been 
stored in the 82490XP's memory cycle buffer. The 
MBC supplies BRDY#s to the CPU (clocks 7-10). 
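
The tag state chosen at the end of the snoop window can be sketched as a small decision function in Python. Only the direct to [M] case (DRCTM# asserted with MWB/WT# high) is stated in the text above. The [E] and [S] outcomes for the remaining combinations follow the usual MESI convention and are assumptions here, as is the function name.

```python
def line_fill_tag_state(mwb_wt, drctm_n):
    """Return the tag state for a cacheable line fill.

    mwb_wt:  level of MWB/WT# sampled during SWEND# (1 = write back mode).
    drctm_n: level of DRCTM# sampled during SWEND# (0 = asserted).
    """
    if mwb_wt == 1 and drctm_n == 0:
        return "M"  # direct to [M], skipping [E] (from the text)
    if mwb_wt == 1:
        return "E"  # ASSUMPTION: write back mode without DRCTM#
    return "S"      # ASSUMPTION: write through mode
```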

The 82495XP issues a new CADS# in clock 13, 
which also misses the 82495XP/82490XP cache. 
Since the 82495XP has already sampled CNA# asserted 
(clock 4), it issues a new CADS# prior to 
receiving CRDY# of the current cycle (i.e. this cycle 
is pipelined within the MBC). Note that once the cycle 
progress signals (BGT#, CNA#, KWEND#, 
SWEND#) of a cycle are sampled asserted, the 
82495XP ignores them until the CRDY# of that cycle. 
The 82495XP does not pipeline the cycle progress 
signals (i.e. it will not sample them again until 
after CRDY# of the current memory bus cycle). 
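
The rule that each cycle progress signal is recognized once per cycle and then ignored until CRDY# can be illustrated with a short Python sketch. The class, its method, and the use of signal names without the trailing # are hypothetical conveniences, not part of the device interface.

```python
class ProgressTracker:
    """Hypothetical model of cycle progress signal sampling."""

    SIGNALS = ("BGT", "CNA", "KWEND", "SWEND")

    def __init__(self):
        self.seen = set()

    def clock(self, asserted, crdy=False):
        """Process one clock.

        asserted: names of progress signals asserted in this clock.
        crdy:     True if CRDY# is asserted in this clock.
        Returns the set of signals newly recognized in this clock.
        """
        new = {s for s in asserted if s in self.SIGNALS and s not in self.seen}
        self.seen |= new
        if crdy:
            # CRDY# ends the cycle, so the signals are sampled again afterwards.
            self.seen.clear()
        return new
```

With this model, a second BGT# assertion before CRDY# is not recognized, while the first BGT# after CRDY# is treated as belonging to the next cycle.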

MEMORY BUS CYCLES: 

At the start of this cycle, the master 82495XP does 
not know that the data will be coming from a slave 
82495XP/82490XP and begins a read request to 
main memory to obtain the required data. Since the 
snoop resulted in a hit to a modified line in the second 
cache, the memory request must be backed off 
so that the snooped 82495XP may supply the data. 

The memory address latch enables (MALE and 
MBALE) may remain asserted by the MBC to place 
the address latches in flow through mode. If the 
82495XP is the current bus master, the memory ad- 
dress output enables (MAOE# and MBAOE#) 
should be asserted by the MBC. The memory data 
output enable signal (MDOE#) must remain inactive 
to allow the data pins to be used as inputs. 

For Clocked Memory Bus Mode, MSEL# is driven 
active by the MBC (clock 4) to allow sampling of 
MBRDY# and to latch MZBT# for the transfer. 
MZBT# is sampled on all MCLK edges where 
MSEL# is inactive. Once MSEL# is sampled active 
by the 82495XP, the value of MZBT# sampled on 
the prior MCLK is used for the next transfer. 

MBRDY# is driven active in clocks 4 to 10 to read 
data into the 82490XP cache memory cycle buffers. 
The MBC asserts MEOC# (clock 11) to end the 
read miss cycle on the memory bus and switch the 
memory cycle buffers for a new cycle. MZBT# is 
latched at this time for the next transfer. Note that 
there are 8 transfers needed to fill the 
82495XP/82490XP cache line and only 4 needed 
for the CPU line fill. 
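
The difference between the two transfer counts is simply line size divided by bus width. The Python sketch below shows the arithmetic. The byte counts in the usage note are illustrative assumptions chosen so that the results match the 8 and 4 transfers quoted above; the actual line sizes and bus width depend on the configuration in use.

```python
def transfers_per_line(line_bytes, bus_bytes):
    """Return the number of bus transfers needed to move one line."""
    if bus_bytes <= 0 or line_bytes % bus_bytes != 0:
        raise ValueError("line size must be a multiple of the bus width")
    return line_bytes // bus_bytes
```

Assuming an 8-byte data bus, a 64-byte cache line would need transfers_per_line(64, 8) transfers, which is 8, and a 32-byte CPU line would need transfers_per_line(32, 8), which is 4.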

MBRDY# is again driven active by the MBC in 
clocks 11 to 21 to cause the memory burst counter 
to be incremented and data to be placed into the 
82490XP cache memory cycle buffers for the sec- 
ond read miss cycle. 

For Strobed Memory Bus Mode, MSEL# is driven 
active by the MBC (clock 4) to allow MISTB opera- 
tion and to latch MZBT# for the transfer (on the 
falling edge of MSEL#). MISTB is toggled in clocks 
5 to 11 to cause the memory burst counter to be 
incremented, and data to be placed into the 
82490XP cache memory cycle buffers. Note: MISTB 
latches the memory bus data on both the rising and 
falling edges. The MBC drives MEOC# asserted 
(clock 12) to end the current cycle on the memory 
bus and switch memory cycle buffers for the new 
cycle. MZBT# for the next cycle is latched at this 
time on the falling edge of MEOC#. 

The MBC toggles MISTB (clocks 16 to 21) for the 
second read miss cycle to increment the memory 
burst counter and cause data to be written into the 
82490XP memory cycle buffers. 
Figure 8-11. Read For Ownership 






8.4 Read for Ownership 



8.4.1 WRITE MISS WITH MFRZ# ASSERTED, 
FOLLOWED BY READ TO SAME LINE 



Figure 8-11 illustrates a Read For Ownership cycle. 
First, a CPU initiates a write cycle which misses the 
82495XP/82490XP cache. The MBC issues a "dum- 
my" write to main memory (the write does not actu- 
ally go out to main memory - to save valuable bus 
time). The 82490XP MFRZ# input is used by the 
MBC to indicate that the following line-fill (allocation) 
data (from either main memory or another cache) 
should be merged with the data of the write miss. 
The entire line is then placed into the internal ta- 
gram. 

CACHE CONTROL SIGNALS: 

The CPU initiates a write cycle to the 
82495XP/82490XP cache where the cache tag 
state is looked up. Once the 82495XP determines 
the cycle to be a cache miss, it issues CADS# 
(clock 1) and the associated cycle control signals to 
the MBC (e.g. CW/R#, CM/IO#, CD/C#, RDYSRC, 
MCACHE#, PALLC#) in order to schedule the write 
operation. MCACHE# is not active; RDYSRC is not 
active, indicating that the 82495XP will supply 
BRDY#s to the CPU; PALLC# is active, indicating a 
potential allocate cycle after the write through cycle. 

The write miss data is posted in the 82490XP's 
memory cycle buffer, and the cycle completes with 
no wait states to the CPU. The CPU is free to issue 
another (non-related) cycle while the 82495XP is 
processing the allocation. If this new cycle is a 
cache hit, it will be serviced by the 82495XP immedi- 
ately; but if it is a cache miss, its service will wait 
until the CRDY# of the allocation. 

The memory bus address (MSET[10:0], 
MTAG[11:0], MCFA[6:0]) is valid with CADS# 
(clocks 1 and 5 for the write miss and allocation 
cycle in this example) and remains valid until after 
CNA# is sampled active by the 82495XP (clocks 4 
and 10). MALE and MBALE may be used to hold the 
address as necessary. 

The MBC arbitrates for the memory bus and returns 
BGT# asserted (clock 2), indicating that the write 
through cycle is guaranteed to complete on the 
memory bus. Once the 82495XP samples BGT# as- 
serted, it must finish that cycle on the memory bus. 
Prior to this point, the cycle can be aborted by a 
snoop hit from another cache. 

CNA# is asserted by the MBC (clock 3) to indicate 
that it is ready to schedule a new memory bus cycle. 
Note that after CNA# activation, cycle control sig- 
nals are not guaranteed to be valid. 



When the MBC has determined the cacheability at- 
tribute of the write through cycle, it drives the 
MKEN# signal accordingly. The MBC also drives 
the KWEND# signal at this time, indicating the end 
of the cacheability window. The 82495XP samples 
MKEN# active during KWEND# (clock 3), indicat- 
ing that the missed line should be allocated in the 
cache. 

The MBC asserts SWEND# (clock 5) when the 
snoop window of the write through cycle ends on the 
memory bus. Note that the direct to [M] state qualifier 
signal (DRCTM#) is sampled during SWEND# 
and is inactive for the write. The MBC also issues 
CRDY# to the 82495XP at this time so that the 
82495XP thinks the write cycle completed on the 
memory bus when, in fact, it did not. 

In this example, the 82495XP requests the allocation 
cycle by issuing CADS# in clock 5. The cycle control 
signals are valid at this point: MCACHE# is active, 
indicating the cacheability of the line-fill cycle; 
RDYSRC is not active, indicating that the MBC need 
not supply BRDY#s to the CPU (no BRDY#s are 
necessary for an allocation cycle). 

Once again, the MBC arbitrates for the memory bus 
and returns BGT# asserted (clock 6) for the alloca- 
tion cycle. The MBC asserts CNA#, KWEND#, and 
SWEND# (clock 9) to pipeline the memory bus and 
close the cacheability and snoop windows. Note that 
(for this example) DRCTM# is asserted during 
SWEND# to place the line in the modified state. 
Since this is done, all other caches must invalidate 
their copies. 

CRDY# for the allocation (line-fill) cycle is issued by 
the MBC in clock 11 to complete the read cycle on 
the memory bus and place the data into the 
82490XP cache array. 

MEMORY BUS SIGNALS: 

The memory address latch enables (MALE and 
MBALE) may remain asserted by the MBC to place 
the address latches in the flow through mode. If the 
82495XP is the current bus master, the memory ad- 
dress output enables (MAOE# and MBAOE#) 
should be asserted by the MBC. 

For Clocked Memory Bus Mode, the memory data 
output enable (MDOE#) has been asserted by the 
MBC to drive the memory data outputs. 

The MBC asserts MSEL# (clock 2) to allow sam- 
pling of MBRDY# and to latch MZBT# and MFRZ# 
for the write. MBRDY# and MEOC# are asserted 
by the MBC (clock 3) to place the write data into the
memory cycle buffers, sample MZBT# and MFRZ# 
for the next transfer, and end the current cycle on 
the memory bus. MFRZ# is driven active by the 
MBC here, indicating to the 82495XP that the data 
of the write through will be merged with the following 
allocation data. 

For the allocation (line fill) cycle, MSEL# is driven 
active again by the MBC (clock 6) to allow sampling 
of MBRDY# and to latch MZBT# for the transfer. 
MZBT# is sampled on all MCLK edges where 
MSEL# is inactive. Once MSEL# is sampled active 
by the 82495XP, the value of MZBT# sampled on 
the prior MCLK is used for the next transfer. 
MDOE# is also deasserted in clock 6 to allow the 
data pins to be used as inputs for the allocation cy- 
cle. 
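
The MZBT# latching rule for Clocked Memory Bus Mode can be illustrated with a small trace model. This is a minimal sketch, not part of the chipset documentation: the function name and the list-based trace format (one sampled level per MCLK rising edge, 1 = high, 0 = low) are assumptions made for illustration only.

```python
def latched_mzbt(msel_n, mzbt_n):
    """Return the MZBT# level used for the next transfer, or None.

    msel_n and mzbt_n each hold one sampled level (0 or 1) per MCLK
    rising edge. MZBT# is tracked while MSEL# is inactive (1). When
    MSEL# is first sampled active (0), the MZBT# value sampled on the
    prior MCLK is the one that applies to the next transfer.
    """
    for i in range(1, len(msel_n)):
        if msel_n[i] == 0 and msel_n[i - 1] == 1:
            return mzbt_n[i - 1]
    return None  # MSEL# never went active in this trace


# MSEL# goes active on the third edge, so the MZBT# level from the
# second edge (0) is the one latched for the transfer.
print(latched_mzbt([1, 1, 0, 0], [1, 0, 1, 1]))
```

A model like this may help when checking MBC timing diagrams by hand; it does not account for setup and hold times.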

MBRDY# is driven active by the MBC in clocks 7 to 
9 to cause the memory burst counter to be incre- 
mented and data to be placed into the 82490XP 
cache memory cycle buffers. During the line fill, the 
82490XP will merge the data from the write through
buffer with the incoming data from either main mem- 
ory or another cache (if that line was a write hit to 
[M] in another cache). 

The MBC drives MEOC# asserted (clock 10) to end 
the allocation cycle on the memory bus and switch 
memory cycle buffers for the new cycle. MZBT# is 
sampled at this time for the next data transfer. 

For Strobed Memory Bus Mode, the memory data 
output enable (MDOE#) has been asserted by the 
MBC to drive the memory data outputs. 

The MBC asserts MSEL# (clock 2) to allow toggling
of MISTB and to latch MZBT# and MFRZ# for the 
write (on MSEL# falling edge). MISTB is toggled 
and MEOC# asserted by the MBC (clock 2) to place 
the write data into the memory cycle buffers, sample 
MZBT# and MFRZ# for the next transfer (on the 
falling edge of MEOC# while MSEL# is active), and 
end the current cycle on the memory bus. MFRZ# is 
driven active by the MBC here, indicating to the 
82495XP that the data of the write through will be 
merged with the following allocation data. 

For the allocation (line fill) cycle, MSEL# is driven 
active again by the MBC (clock 7) to allow sampling
of MOSTB and to latch MZBT# for the transfer. 
MDOE# is also deasserted in clock 7 to allow the 
data pins to be used as inputs for the allocation cy- 
cle. 

MOSTB is toggled by the MBC in clocks 8 to 10 to
cause the memory burst counter to be incremented
and data to be placed into the 82490XP cache memory
cycle buffers. During the line fill, the 82490XP
will merge the data from the write through buffer with 
the incoming data from either main memory or an- 
other cache (if that line was a write hit to [M] in 
another cache). 

The MBC drives MEOC# asserted (clock 11) to end
the allocation cycle on the memory bus and switch 
memory cycle buffers for the new cycle. MZBT# is 
sampled at this time for the next data transfer. 



8.5 I/O Cycles

Figure 8-12 illustrates CPU initiated I/O cycles, both 
read and write. I/O writes are the only write cycles 
not posted by the 82495XP/82490XP cache (i.e., the
cycle is not fully acknowledged to the CPU until it 
has completed on the memory bus). 

CACHE CONTROL SIGNALS: 

The CPU initiates an I/O write cycle to the
82495XP/82490XP. The 82495XP then issues
CADS# and CDTS# (clock 1) and the associated
cycle control signals to the MBC (e.g., CW/R#,
CM/IO#, CD/C#, RDYSRC, MCACHE#). MCACHE# is
not active, indicating that the cycle is not cacheable;
RDYSRC is active, indicating that the MBC must
supply BRDY#s to the CPU/Cache core.
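
As an illustration of how an MBC model might classify a cycle from these control signals, the following is a minimal sketch. It assumes the usual Intel pin-naming convention (for a pin named X/Y#, high selects X and low selects Y), that MCACHE# is active low, and that RDYSRC is active high. The function name and return format are hypothetical; the signal descriptions elsewhere in this data sheet are authoritative.

```python
def describe_cycle(cw_r, cm_io, cd_c, mcache_n, rdysrc):
    """Summarize a cycle from the levels sampled with CADS# (1 = high)."""
    return {
        "kind": "write" if cw_r else "read",        # CW/R#
        "space": "memory" if cm_io else "I/O",      # CM/IO#
        "content": "data" if cd_c else "code",      # CD/C#
        "cacheable": mcache_n == 0,                 # MCACHE# is active low
        # RDYSRC active: the MBC supplies BRDY#; otherwise the 82495XP does.
        "brdy_source": "MBC" if rdysrc else "82495XP",
    }


# The I/O write of this example: not cacheable, BRDY# supplied by the MBC.
print(describe_cycle(cw_r=1, cm_io=0, cd_c=1, mcache_n=1, rdysrc=1))
```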

The memory bus address (MSET[10:0],
MTAG[11:0], MCFA[6:0]) is valid with CADS#
(clocks 1 and 10 for the two cycles in this example)
and remains valid until after CNA# is sampled active
by the 82495XP (clocks 6 and 17). MALE and MBALE
may be used to hold the address as necessary.

The MBC arbitrates for the memory bus and returns 
BGT# asserted (clock 2) for the I/O write cycle, in- 
dicating that the cycle is guaranteed to complete on 
the memory bus. Once the 82495XP samples BGT# 
asserted, it must finish that cycle on the memory 
bus. Prior to this point, the cycle can be aborted by a 
snoop hit from another cache. 

CNA# for the write cycle is asserted by the MBC 
(clock 5) to indicate that it is ready to schedule a 
new memory bus cycle. Note that SWEND# and 
KWEND# are not needed for I/O cycles since they 
are not cacheable. 

The MBC asserts BRDY# in clock 7 to complete the 
I/O write cycle from the CPU, and CRDY# in clock 8 
to complete the cycle on the memory bus from the 
82495XP/82490XP cache. 

Figure 8-12. I/O Write and Read Cycles

A new CADS# is issued from the 82495XP in clock
10 for an I/O read cycle, along with the associated 
cycle control signals. MCACHE# is again not active, 
and RDYSRC is again active. 

The MBC returns BGT# asserted right away (clock 
11). The 82495XP can pipeline I/O cycles, but does 
not for the I/O read in this example. 

Upon completing the access on the memory bus, 
the MBC activates BRDY# (clock 17) and CRDY# 
(clock 16). Note that BRDY# of a cycle may come 
before (as in the I/O write cycle of this example), 
with or after the CRDY# of the same cycle. 

MEMORY BUS SIGNALS: 

The memory address latch enables (MALE and 
MBALE) may remain asserted by the MBC to place 
the address latches in flow through mode. If the 
82495XP is the current bus master, the memory ad- 
dress output enables (MAOE# and MBAOE#) 
should be asserted by the MBC. 

For Clocked Memory Bus Mode, the memory data
output enable signal (MDOE#) is asserted by the 
MBC in clock 3 to drive the memory data outputs. 

MEOC# is asserted by the MBC (clock 5) to latch 
MZBT# for the I/O write transfer, and end that cycle 
on the memory bus (MBRDY# is not necessary 
since this example shows a single transfer cycle). 
MZBT# is driven high by the MBC in order to force 
the write cycle to begin with the correct burst ad- 
dress. MFRZ# is also sampled here (it need not be 
active since the cycle is not potentially allocatable). 

For the I/O read cycle, MDOE# is deasserted (clock 
12) by the MBC to allow the data pins to be used as 
inputs. 

MSEL# is driven active by the MBC (clock 12) to 
allow sampling of MBRDY# and to latch MZBT# for 
the transfer. MZBT# is sampled on all MCLK edges 
where MSEL# is inactive. Once MSEL# is sampled 
active by the 82495XP, the value of MZBT# sam- 
pled on the prior MCLK is used for the next transfer. 
Again, MZBT# is driven high by the MBC to force 
the transfer to begin with the correct burst address. 

The MBC asserts MBRDY# (clock 14) to cause the 
memory burst counter to be incremented and data to 
be placed into the 82490XP cache memory cycle 
buffers. The MBC drives MEOC# asserted (clock 
15) to end the read cycle on the memory bus and 
switch memory cycle buffers for a new cycle. 
MZBT# for the next transfer is latched at this time. 



For Strobed Memory Bus Mode, the memory data
output enable signal (MDOE#) has been asserted 
by the MBC to drive the memory data outputs. 

MEOC# is asserted by the MBC (clock 5) to latch 
MZBT# for the I/O write transfer (on MEOC# falling 
edge), and end that cycle on the memory bus 
(MOSTB is not necessary since this example shows 
a single transfer cycle). MZBT# is driven high by the 
MBC in order to force the write cycle to begin with 
the correct burst address. MFRZ# is also sampled 
here (it need not be active since the cycle is not 
potentially allocatable). 

For the I/O read cycle, MDOE# is deasserted (clock 
10) by the MBC to allow the data pins to be used as 
inputs. 

MSEL# is driven active by the MBC (clock 10) to 
allow operation of MISTB and to latch MZBT# for 
the transfer (on MSEL# falling edge). Again, 
MZBT# is driven high by the MBC to force the trans- 
fer to begin with the correct burst address. 

The MBC toggles MISTB (clock 15) to cause the 
memory burst counter to be incremented and data to 
be placed into the 82490XP cache memory cycle 
buffers for the I/O read cycle. Note: MISTB latches 
the memory bus data on both the rising and falling 
edges. The MBC drives MEOC# asserted (clock 16) 
to end the read cycle on the memory bus and switch 
memory cycle buffers for a new cycle. MZBT# for 
the next transfer is latched at this time (on the falling 
edge of MEOC#). 



8.6 LOCKed Cycles 

8.6.1 CPU READ MODIFY WRITE CYCLES 

The 82495XP provides a facility to allow atomic ac- 
cesses requested by the CPU (via CPU LOCK# acti- 
vation) through the 82495XP KLOCK# signal. Fig- 
ure 8-13 illustrates two back-to-back CPU initiated 
Locked read-modify-write cycles. KLOCK# activa- 
tion indicates to the MBC that the memory bus 
should not be released between the KLOCKed cy- 
cles. KLOCK# will remain asserted from the begin- 
ning of the first cycle (with CADS#) until one clock 
after the CADS# of the last cycle. The 82495XP does
not distinguish between back-to-back locked opera- 
tions and will not open an arbitration window (deas- 
sert KLOCK#) between them. It is the responsibility 
of the MBC to distinguish between the multiple RMW 
sequences, if it is so desired. 

Figure 8-13. LOCKed Read-Modify-Write Cycles

The 82495XP issues a request for a memory bus
access (CADS#) for every locked cycle (read or
write) regardless of whether it hits the cache tag or not.
Locked read cycles are treated by the 82495XP as
cache misses, and, if the line is in the [M] state, the
82495XP ignores the data on the memory bus and
uses the data in the 82490XP array. Locked write
cycles are treated as write through, and the tag state
does not change even if the line is in the 82490XP
array.

CACHE CONTROL SIGNALS: 

The CPU initiates a Locked read cycle to the
82495XP/82490XP cache where, due to the assertion
of CPU LOCK#, it assumes a cache miss and
issues CADS# to the MBC (clock 1) along with the
associated cycle control signals (e.g., CW/R#,
CM/IO#, CD/C#, RDYSRC, MCACHE#). MCACHE# is
never asserted for LOCKed cycles; RDYSRC is active,
indicating that the MBC must supply BRDY# to
the CPU/Cache core.

The memory bus address (MSET[10:0],
MTAG[11:0], MCFA[6:0]) is valid with CADS#
(clocks 1 and 5, then 7 and 11 for the two locked
RMW sequences in this example) and remains valid
until after CNA# is sampled active by the 82495XP
(clocks 3 and 7, then 9 and 13). MALE and MBALE
may be used to hold the address as necessary.

The MBC arbitrates for the memory bus and returns 
BGT# asserted (clock 2), indicating that the cycle is 
guaranteed to complete on the memory bus. Once 
the 82495XP samples BGT# asserted, it must finish 
that cycle on the memory bus. Prior to this point, the 
cycle can be aborted by a snoop hit from another 
cache. 

CNA# for the read cycle is also asserted by the 
MBC (clock 2) to indicate that it may schedule a new 
memory bus cycle. Note that the cycle control sig- 
nals are not guaranteed to be valid after CNA# acti- 
vation. 

The MBC asserts BRDY# to the CPU/Cache core 
in clock 4. CRDY# for the locked read cycle is as- 
serted to the 82495XP/82490XP from the MBC 
(clock 5) to load the data stored in the 82490XP's 
memory cycle buffers into the cache array. If the 
read was to a dirty line, the 82495XP is intelligent 
enough to ignore the data in the memory cycle buff- 
ers and use the data in the cache array. 

Locked sequences always end in a write cycle, and no
new CPU initiated cycles may be inserted between
the Locked read and Locked write cycles. Therefore,
the 82495XP issues a new memory cycle request
(CADS# in clock 5) for the Locked write as soon as
it completes the Locked read cycle. The cycle control
signals are also valid at this time. RDYSRC is not
active, indicating that the 82495XP will supply
BRDY# to the CPU.

The locked write cycle is posted like any other mem- 
ory write cycle. 

In this example, the CPU initiates a second
read-modify-write cycle immediately. KLOCK# is not
deasserted between the back-to-back locked sequences
since the CPU LOCK# remains asserted. If
snooping is required between these cycles, it is the
MBC's responsibility to predict this boundary and
allow snooping. The 82495XP issues a memory bus
request (CADS#) in clock 7 for the second locked
read cycle, along with the new cycle control signals.

The second locked RMW sequence repeats the actions
of the first. Its purpose in this example is to
demonstrate that an arbitration window may not
open between locked sequences if they follow one
another with no idle or non-locked cycles between
them.
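
Because KLOCK# stays asserted across back-to-back sequences, an MBC that wants to find the boundary between them must infer it from the cycle types. The following is a minimal sketch of one possible heuristic, assuming each locked sequence consists of one or more reads followed by one or more writes, so that a read following a write marks a new sequence. The function name and the 'R'/'W' encoding are hypothetical.

```python
def split_locked_sequences(cycles):
    """Split a run of locked cycles ('R' or 'W') into RMW sequences.

    Assumes every locked sequence ends in a write, so a read that
    directly follows a write starts a new sequence.
    """
    sequences = []
    current = []
    for cycle in cycles:
        if current and current[-1] == "W" and cycle == "R":
            sequences.append(current)
            current = []
        current.append(cycle)
    if current:
        sequences.append(current)
    return sequences


# The two back-to-back sequences of Figure 8-13.
print(split_locked_sequences(["R", "W", "R", "W"]))
```

An MBC could use such a boundary to open a snoop or arbitration window of its own between the sequences.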

MEMORY BUS SIGNALS: 

The memory address latch enables (MALE and 
MBALE) may remain asserted by the MBC to place 
the address latches in flow through mode. If the 
82495XP is the current bus master, the memory ad- 
dress output enables (MAOE# and MBAOE#) 
should be asserted by the MBC. 

For Clocked Memory Bus Mode, MSEL# is driven 
active by the MBC (clock 3) to allow sampling of 
MBRDY# and to latch MZBT# for the transfer. 
MZBT# is sampled on all MCLK edges where 
MSEL# is inactive. Once MSEL# is sampled active 
by the 82495XP, the value of MZBT# sampled on 
the prior MCLK is used for the next transfer. 

The memory data output enable signal (MDOE#) 
must be inactive to allow the data pins to be used as 
inputs for the first locked read cycle. The MBC as- 
serts MEOC# (clock 4) to latch MZBT# for the next 
transfer, and end the current locked read cycle on 
the memory bus (MBRDY# is not necessary since 
this example shows a single transfer cycle). MZBT# 
is driven high by the MBC in order to force the read 
cycle to begin with the correct burst address. 

For the locked write cycle, MDOE# is asserted by 
the MBC (clock 5) to drive the memory data outputs. 
MEOC# is again asserted (clock 6) to latch MZBT#
for the next transfer, and end the current locked 
write cycle on the memory bus (MBRDY# is not 
necessary since this is a single transfer cycle). 
MZBT# is again driven high. MFRZ# is also sam- 
pled during write cycles when MEOC# is sampled 
active by the 82495XP. 

MDOE# is deasserted by the MBC (clock 7) to allow 
the data pins to be used as inputs for the second 
locked read cycle. MEOC# is again asserted (clock 
8) to latch MZBT# for the next transfer, and end the 
locked read cycle on the memory bus. MZBT# is 
again driven high. 

MDOE# is asserted by the MBC (clock 9) to drive 
the memory data outputs for the second locked write 
cycle. MBRDY# is asserted (clock 13) to cause the 
memory burst counter to be incremented and data to 
be placed into the 82490XP cache memory cycle 
buffers. The MBC drives MEOC# active and 
MSEL# inactive (clock 14) to end the locked write 
cycle on the memory bus and switch memory cycle 
buffers for a new cycle. MZBT# and MFRZ# for the 
next transfer are sampled at this time. 

For Strobed Memory Bus Mode, MSEL# is driven 
active by the MBC (clock 1) to allow sampling of 
MxSTB and to latch MZBT# for the first locked read 
transfer (on the falling edge of MSEL#). 

The memory data output enable signal (MDOE#) 
must be inactive to allow the data pins to be used as 
inputs for the first locked read cycle. The MBC as- 
serts MEOC# (clock 3) to latch MZBT# for the next 
transfer (on MEOC# falling edge while MSEL# is 
active), and end the current locked read cycle on the 
memory bus (MISTB is not necessary since this ex- 
ample shows a single transfer cycle). MZBT# is 
driven high by the MBC in order to force the read 
cycle to begin with the correct burst address. 

For the locked write cycle, MDOE# is asserted by 
the MBC (clock 4) to drive the memory data outputs. 
MEOC# is again asserted (clock 6) to latch MZBT# 
for the next transfer, and end the current locked 
write cycle on the memory bus (MOSTB is not nec- 
essary since this is a single transfer cycle). MZBT# 
is again driven high. MFRZ# is also sampled on the 
falling edge of MEOC#. 

MDOE# is deasserted by the MBC (clock 7) to allow 
the data pins to be used as inputs for the second 
locked read cycle. MEOC# is again asserted (clock 
8) to latch MZBT# for the next transfer, and end the 
locked read cycle on the memory bus. MZBT# is 
again driven high. 



MDOE# is asserted by the MBC (clock 9) to drive 
the memory data outputs for the second locked write 
cycle. MOSTB is toggled (clock 12) to cause the 
memory burst counter to be incremented and data to 
be placed into the 82490XP cache memory cycle 
buffers. The MBC drives MEOC# active and 
MSEL# inactive (clock 13) to end the locked write 
cycle on the memory bus and switch memory cycle 
buffers for a new cycle. MZBT# and MFRZ# for the 
next transfer are sampled at this time. 



9.0 TESTABILITY 

Testing the 82495XP/82490XP chipset can be divided
into three categories: Built-in Self Test (BIST),
Boundary Scan, and external testing. BIST performs
basic device testing on the 82495XP. Boundary
Scan provides additional test hooks that conform to
the IEEE Standard Test Access Port and Boundary
Scan Architecture (IEEE Std. 1149.1). Additional
testing can be performed by using software written
to test the 82490XP cache SRAM.



9.1 Built-in Self Test (BIST) 

BIST tests the internal functionality of the 82495XP.
The 82495XP's BIST tests approximately 90% of 
the cache controller. It tests the tag RAM and com- 
parators. 

The 82495XP BIST is initiated by driving 
SLFTST#(CRDY#) low and HIGHZ#(MBALE) high 
at least 10 clocks before RESET goes inactive. The 
82495XP Cache Controller reports the result of BIST 
on the CAHOLD signal. When the self test com- 
pletes, the 82495XP drives FSIOUT# inactive and 
the BIST result on CAHOLD. If CAHOLD is driven
active, BIST passed. If CAHOLD is
driven inactive, BIST detected a flaw in the cache 
controller. CAHOLD is valid for one clock after 
FSIOUT# deactivation and should be sampled on 
the rising edge of FSIOUT#. 
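
The sampling rule above can be sketched as a check over captured signal traces. This is an illustrative model only: the function name and trace format (one level per clock, 1 = high) are assumptions, and CAHOLD is taken to be active high since it carries no # suffix.

```python
def bist_passed(fsiout_n, cahold):
    """Return True/False for the BIST result, or None if BIST never ended.

    fsiout_n and cahold hold one sampled level per clock. The result is
    taken from CAHOLD on the rising edge of FSIOUT# (its deactivation).
    """
    for i in range(1, len(fsiout_n)):
        if fsiout_n[i - 1] == 0 and fsiout_n[i] == 1:
            return cahold[i] == 1  # active CAHOLD means BIST passed
    return None


# FSIOUT# rises on the third clock with CAHOLD active: BIST passed.
print(bist_passed([0, 0, 1, 1], [0, 0, 1, 0]))
```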

On the 82495XP, BIST only informs the system
whether or not a failure occurred. BIST is not able to
indicate where a failure occurred. After completing
BIST, the cache controller performs a reset and begins
normal operation.



9.2 Boundary Scan 

The 82495XP/82490XP chipset provides additional
testability features compatible with the IEEE Standard
Test Access Port and Boundary Scan Architecture
(IEEE Std. 1149.1). The test logic provided
allows for testing to ensure that components function
correctly, that interconnections between various
components are correct, and that various components
interact correctly on the printed circuit board.

The boundary scan test logic consists of a boundary 
scan register and support logic that are accessed 
through a test access port (TAP). The TAP provides 
a simple serial interface that makes it possible to 
test all signal traces with only a few probes. 

The TAP can be controlled via a bus master. The 
bus master can be either automatic test equipment 
or a component (PLD) that interfaces to the four-pin 
test bus. 



9.2.1 BOUNDARY SCAN ARCHITECTURE 

The boundary scan test logic contains the following 
elements: 

— Test access port (TAP), consisting of input pins 
TMS, TCK, and TDI; and output pin TDO.

— TAP controller, which interprets the inputs on the 
test mode select (TMS) line and performs the 
corresponding operation. The operations per- 
formed by the TAP include controlling the in- 
struction and data registers within the compo- 
nent. 

— Instruction register (IR), which accepts instruc- 
tion codes shifted into the test logic on the test 
data input (TDI) pin. The instruction codes are 
used to select the specific test operation to be 
performed or the test data register to be ac- 
cessed. 

— Test data registers: The 82495XP/82490XP 
chipset components each contain three test data 
registers: Bypass register (BPR), Device Identifi- 
cation register (DID), and Boundary Scan regis- 
ter (BSR). 

The instruction and test data registers are separate 
shift-register paths connected in parallel and have a 
common serial data input and a common serial data 
output connected to the TAP signals, TDI and TDO, 
respectively. 



9.2.2 DATA REGISTERS 

The 82495XP and 82490XP both contain the two
required test data registers: the bypass register and
the boundary scan register. In addition, they also have a
device identification register.



Each test data register is serially connected to TDI 
and TDO, with TDI connected to the most significant 
bit and TDO connected to the least significant bit of 
the test data register. Data is shifted one stage (bit 
position within the register) on each rising edge of 
the test clock (TCK). 

9.2.2.1 Bypass Register 

The Bypass Register is a one-bit shift register that 
provides the minimal length path between TDI and 
TDO. This path can be selected when no test opera- 
tion is being performed by the component to allow 
rapid movement of test data to and from other com- 
ponents on the board. While the bypass register is 
selected data is transferred from TDI to TDO without 
inversion. 
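
The effect of the bypass register on a board-level scan chain can be modeled in a few lines. This is a minimal sketch under stated assumptions: every device in the chain has BYPASS selected, each bypass register starts at 0 (the value IEEE Std. 1149.1 specifies for Capture-DR), and the function name is hypothetical.

```python
def shift_through_bypass(bits_in, n_devices):
    """Shift bits through n_devices one-bit bypass registers in series.

    Returns the bits seen on the final TDO, one per TCK. Each bypassed
    device delays the data by exactly one TCK, without inversion.
    """
    if n_devices == 0:
        return list(bits_in)
    regs = [0] * n_devices          # regs[0] is nearest the board TDI
    out = []
    for bit in bits_in:
        out.append(regs[-1])        # last register drives the board TDO
        regs = [bit] + regs[:-1]    # all stages shift on the TCK edge
    return out


# Two bypassed devices delay the stream by two TCK cycles.
print(shift_through_bypass([1, 0, 1, 1], 2))
```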

9.2.2.2 Boundary Scan Register 

The Boundary Scan Register is a single shift register 
path containing the boundary scan cells that are 
connected to all input and output pins of the 
82495XP/82490XP chipset. Figure 9-1 shows the
logical structure of the boundary scan register. While 
output cells determine the value of the signal driven 
on the corresponding pin, input cells only capture 
data; they do not affect the normal operation of the 
device. Data is transferred without inversion from 
TDI to TDO through the boundary scan register dur- 
ing scanning. The boundary scan register can be op- 
erated by the EXTEST and SAMPLE instructions. 
The boundary scan register order is described in
Section 9.2.5.

9.2.2.3 Device Identification Register 

The Device Identification Register contains the
manufacturer's identification code, part number code,
and version code in the format shown in Figure 9-2.
Table 9-1 lists the codes corresponding to the
82495XP and 82490XP.

Table 9-1. Device ID Register Values

Component            Version Code   Part Number Code   Manufacturer Identity
82495XP (A0 or A1)   0Ah            0495h              09h
82495XP (B0)         0Bh            0495h              09h
82490XP (A0 or A1)   00h            49A0h              09h

Figure 9-1. Boundary Scan Register Structure

Figure 9-2. Device ID Register
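
Test software that reads the device identification register usually splits the value into its fields. The following is a minimal sketch that assumes the standard IEEE Std. 1149.1 layout (bit 0 fixed at 1, bits 11:1 manufacturer identity, bits 27:12 part number, bits 31:28 version). Figure 9-2 is the authoritative format for these components; the function name and the example value are illustrative only.

```python
def decode_idcode(idcode):
    """Split a 32-bit IDCODE into fields (IEEE Std. 1149.1 layout assumed)."""
    if idcode & 1 != 1:
        raise ValueError("bit 0 of a device identification code must be 1")
    return {
        "version": (idcode >> 28) & 0xF,
        "part_number": (idcode >> 12) & 0xFFFF,
        "manufacturer": (idcode >> 1) & 0x7FF,
    }


# Example value built from version Ah, part number 0495h, manufacturer 09h.
fields = decode_idcode(0xA0495013)
print({name: hex(value) for name, value in fields.items()})
```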



9.2.2.4 Runbist Register 

The Runbist Register is a one-bit register used to
report the results of the 82495XP BIST when it is
initiated by the RUNBIST instruction. This register is
loaded with a "0" prior to invoking the BIST and is
loaded with a "1" upon successful completion. A "0"
indicates that a failure occurred during BIST.

NOTE:

82495XP RUNBIST is not available in the A-stepping.



9.2.3 INSTRUCTION REGISTER 

The Instruction Register (IR) allows instructions to 
be serially shifted into the device. The instruction 
selects the particular test to be performed, the test 
data register to be accessed, or both. The instruc- 
tion register is four (4) bits wide. The most significant 
bit is connected to TDI and the least significant bit is 
connected to TDO. There are no parity bits associat- 
ed with the Instruction register. Upon entering the 
Capture-IR TAP controller state, the Instruction reg- 
ister is loaded with the default instruction "0001", 
SAMPLE/PRELOAD. Instructions are shifted into 
the instruction register on the rising edge of TCK 
while the TAP controller is in the Shift-IR state. 
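
The shift behavior described above can be modeled as follows. This is a minimal sketch, not a description of the device's internal logic: it assumes the instruction is presented on TDI least significant bit first, so that after four TCK rising edges it sits in the register in its normal bit order. The function name is hypothetical.

```python
IR_WIDTH = 4  # the instruction register is four bits wide


def shift_ir(instruction, captured=0b0001):
    """Shift a 4-bit instruction into the IR during Shift-IR.

    TDI feeds the most significant bit and TDO is driven by the least
    significant bit. 'captured' is the value loaded in Capture-IR.
    Returns the new IR contents and the bits observed on TDO.
    """
    ir = captured
    tdo_bits = []
    for i in range(IR_WIDTH):
        tdi = (instruction >> i) & 1               # send the LSB first
        tdo_bits.append(ir & 1)                    # LSB appears on TDO
        ir = (ir >> 1) | (tdi << (IR_WIDTH - 1))   # shift toward TDO
    return ir, tdo_bits


# Loading IDCODE (0010) shifts out the captured default 0001, LSB first.
print(shift_ir(0b0010))
```

Observing the captured "0001" pattern on TDO is a common way to confirm that the scan path through the device is intact.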

9.2.3.1 82495XP Boundary Scan Instruction Set

The 82495XP cache controller supports all three
mandatory boundary scan instructions (BYPASS,
SAMPLE/PRELOAD, and EXTEST) along with one
optional instruction (IDCODE). On the B-stepping of
the 82495XP, two additional optional instructions will
be implemented (RUNBIST and TRISTATE). Table
9-2 lists the 82495XP boundary scan instruction
codes. The instructions listed as PRIVATE cause
TDO to become enabled in the Shift-DR state and
cause "0" to be shifted out of TDO on the rising
edge of TCK. Execution of the PRIVATE instructions
will not cause hazardous operation of the 82495XP.
Note that system tests should not execute instruction
codes labeled "RESERVED". These instructions
can put the component in an indeterminate
state which can only be cleared by a power-on reset.



Table 9-2. 82495XP Boundary Scan Instruction Codes

Instruction Code   Instruction Name
0000               EXTEST
0001               SAMPLE
0010               IDCODE
0011               RESERVED
0100               RESERVED
0101               RESERVED
0110               RESERVED
0111               *RUNBIST
1000               *TRISTATE
1001               RESERVED
1010               PRIVATE
1011               PRIVATE
1100               PRIVATE
1101               PRIVATE
1110               PRIVATE
1111               BYPASS

* RUNBIST and TRISTATE are boundary scan instructions
that will be implemented in the B-stepping of the 82495XP.
They are not available on the A-stepping.
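
Test software can guard against accidentally issuing a RESERVED code by checking each instruction against Table 9-2 before shifting it in. The following is a minimal sketch; the function name and the choice to raise an error for RESERVED codes and for B-stepping instructions on an A-stepping part are assumptions, not documented device behavior.

```python
INSTRUCTIONS_82495XP = {
    0b0000: "EXTEST",
    0b0001: "SAMPLE/PRELOAD",
    0b0010: "IDCODE",
    0b0111: "RUNBIST",    # B-stepping only
    0b1000: "TRISTATE",   # B-stepping only
    0b1111: "BYPASS",
}
PRIVATE_CODES = range(0b1010, 0b1111)   # 1010 through 1110
B_STEPPING_ONLY = {"RUNBIST", "TRISTATE"}


def lookup_82495xp(code, stepping="B"):
    """Return the instruction name for a 4-bit code, per Table 9-2."""
    if code in PRIVATE_CODES:
        return "PRIVATE"
    name = INSTRUCTIONS_82495XP.get(code)
    if name is None:
        raise ValueError(f"code {code:04b} is RESERVED; do not execute")
    if name in B_STEPPING_ONLY and stepping == "A":
        raise ValueError(f"{name} is not available on the A-stepping")
    return name


print(lookup_82495xp(0b0010))
```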

EXTEST The instruction code is "0000". The EX- 
TEST instruction allows testing of cir- 
cuitry external to the component pack- 
age, typically board interconnects. It 
does so by driving the values loaded 
into the 82495XP boundary scan regis- 
ter out on the output pins corresponding 
to each boundary scan cell and capturing
the values on 82495XP input pins
to be loaded into their corresponding 
boundary scan register locations. I/O 
pins are selected as input or output, de- 
pending on the value loaded into their 
control setting locations in the boundary 
scan register. Values shifted into input 
latches in the boundary scan register 
are never used by the internal logic of 
the 82495XP. Note: after using the EX- 
TEST instruction, the 82495XP must be 
reset before normal (non-boundary 
scan) use. 

SAMPLE/PRELOAD The instruction code is "0001". The
SAMPLE/PRELOAD instruction has two
functions. When the TAP controller
is in the Capture-DR state, the
SAMPLE/PRELOAD instruction allows a
"snapshot" of the normal operation of
the component without interfering with
that normal operation. The instruction
causes boundary scan register cells
associated with outputs to sample the
value being driven by the 82495XP. It
causes the cells associated with inputs to
sample the value being driven into the
82495XP. On both outputs and inputs
the sampling occurs on the rising edge
of TCK. When the TAP controller is in
the Update-DR state, the SAMPLE/
PRELOAD instruction preloads data to
the device pins to be driven to the board
by executing the EXTEST instruction.
Data is preloaded to the pins from the
boundary scan register on the falling
edge of TCK.

IDCODE The instruction code is "0010". The
IDCODE instruction selects the device
identification register to be connected to
TDI and TDO, allowing the device's
identification code to be shifted out of the
device on TDO. Note that the device
identification register is not altered by
data being shifted in on TDI.

BYPASS The instruction code is "1111". The BY-
PASS instruction selects the bypass 
register to be connected to TDI and 
TDO, effectively bypassing the test logic 
on the 82495XP by reducing the shift 
length of the device to one bit. Note that 
an open circuit fault in the board level 
test data path will cause the bypass reg- 
ister to be selected following an instruc- 
tion scan cycle due to the pull-up resis- 
tor on the TDI input. This has been done 
to prevent any unwanted interference 
with the proper operation of the system 
logic. 

RUNBIST The instruction code is "0111". The
RUNBIST instruction selects the one (1)
bit runbist register, loads a value of "0"
into the runbist register, and connects it
to TDO. It also initiates the built-in self
test (BIST) feature of the 82495XP,
which is able to detect approximately
90% of the stuck-at faults on the
82495XP. The 82495XP ac/dc specifications
for VCC and CLK must be met
and reset must have been asserted at
least once prior to executing the
RUNBIST boundary scan instruction.
After loading the RUNBIST instruction
code in the instruction register, the TAP
controller must be placed in the
Run-Test/Idle state. BIST begins on the first
rising edge of TCK after entering the
Run-Test/Idle state. The TAP controller
must remain in the Run-Test/Idle state
until BIST is completed. It requires 100K
clock (CLK) cycles to complete BIST
and report the result to the runbist register.
After completing the 100K clock
(CLK) cycles, the value in the runbist
register should be shifted out on TDO
during the Shift-DR state. A value of "1"
being shifted out on TDO indicates BIST
successfully completed. A value of "0"
indicates a failure occurred. After
executing the RUNBIST instruction, the
82495XP must be reset prior to normal
operation. NOTE: This instruction is not
available on the A-stepping of the
82495XP. It will be implemented in the
B-stepping.

TRISTATE The instruction code is "1000". The 
TRISTATE instruction initiates the tri- 
state output test mode. After loading the 
TRISTATE boundary scan instruction 
into the instruction register, the TAP 
controller must be placed in the Run-Test/Idle
state. To terminate the tristate
output test mode, the 82495XP must be 
reset. NOTE: This instruction is not 
available on the A-stepping of the 
82495XP. It will be implemented in the 
B-stepping. 
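
When scripting the RUNBIST instruction, the test controller needs to know how long to hold the TAP in Run-Test/Idle. The following is a minimal sketch of that arithmetic based on the 100K CLK cycle requirement stated for RUNBIST. The function names and the example frequencies are illustrative assumptions and are not specified values for the 82495XP.

```python
import math

BIST_CLK_CYCLES = 100_000  # CLK cycles required to complete BIST


def bist_wait_seconds(clk_hz):
    """Minimum time to remain in Run-Test/Idle for a given CLK rate."""
    return BIST_CLK_CYCLES / clk_hz


def min_tck_cycles(clk_hz, tck_hz):
    """TCK cycles that span at least the required number of CLK cycles."""
    return math.ceil(BIST_CLK_CYCLES * tck_hz / clk_hz)


# Illustrative figures only: a 50 MHz CLK and a 10 MHz TCK.
print(bist_wait_seconds(50_000_000))
print(min_tck_cycles(50_000_000, 10_000_000))
```

A practical test program would likely add margin to these minimums before shifting the runbist register out on TDO.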

9.2.3.2 82490XP Boundary Scan Instruction Set 

The 82490XP cache SRAM supports all three
mandatory boundary scan instructions (BYPASS,
SAMPLE/PRELOAD, and EXTEST) along with one
optional instruction (IDCODE). Table 9-3 lists the
82490XP boundary scan instruction codes. The
instructions listed as PRIVATE cause TDO to become
enabled in the Shift-DR state and cause "0" to be
shifted out of TDO on the rising edge of TCK.
Execution of the PRIVATE instructions will not cause
hazardous operation of the 82490XP. Note that system
tests should not execute instruction codes labeled
"INTEL RESERVED". These instructions can put
the component in an indeterminate state which can
only be cleared by a power-on reset.

Table 9-3. 82490XP Boundary Scan Instruction Codes

Instruction Code   Instruction Name
0000               EXTEST
0001               SAMPLE
0010               IDCODE
0011               INTEL RESERVED
0100               INTEL RESERVED
0101               INTEL RESERVED
0110               INTEL RESERVED
0111               INTEL RESERVED
1000               INTEL RESERVED
1001               INTEL RESERVED
1010               PRIVATE
1011               PRIVATE
1100               PRIVATE
1101               PRIVATE
1110               PRIVATE
1111               BYPASS



EXTEST The instruction code is "0000". The EX- 
TEST instruction allows testing of cir- 
cuitry external to the component pack- 
age, typically board interconnects. It 
does so by driving the values loaded 
into the 82490XP boundary scan regis- 
ter out on the output pins corresponding 
to each boundary scan cell and captur- 
ing the values on 82490XP input pins to 
be loaded into their corresponding 
boundary scan register locations. I/O 
pins are selected as input or output, depending 
on the value loaded into their control setting 
locations in the boundary scan register. 
Values shifted into input 
latches in the boundary scan register 
are never used by the internal logic of 
the 82490XP. Note: after using the EX- 
TEST instruction, the 82490XP must be 
reset before normal (non-boundary 
scan) use. 

SAMPLE/PRELOAD The instruction code is "0001". The 
SAMPLE/PRELOAD instruction performs two 
functions. When the TAP controller is in the 
Capture-DR state, the SAMPLE/PRELOAD 
instruction allows a 
"snap-shot" of the normal operation of 
the component without interfering with 
that normal operation. The instruction 
causes boundary scan register cells as- 
sociated with outputs to sample the val- 
ue being driven by the 82490XP. It caus- 
es the cells associated with inputs to 
sample the value being driven into the 
82490XP. On both outputs and inputs 
the sampling occurs on the rising edge 
of TCK. When the TAP controller is in 
the Update-DR state, the SAMPLE/ 
PRELOAD instruction preloads data to 
the device pins to be driven to the board 
by executing the EXTEST instruction. 
Data is preloaded to the pins from the 
boundary scan register on the falling 
edge of TCK. 

IDCODE The instruction code is "0010". The 
IDCODE instruction selects the device 
identification register to be connected to 
TDI and TDO, allowing the device's 
identification code to be shifted out of the 
device on TDO. Note that the device 
identification register is not altered by 
data being shifted in on TDI. 

BYPASS The instruction code is "1111". The 
BYPASS instruction selects the bypass 
register to be connected to TDI and 
TDO, effectively bypassing the test logic 
on the 82490XP by reducing the shift 
length of the device to one bit. Note that 
an open circuit fault in the board level 
test data path will cause the bypass reg- 
ister to be selected following an instruc- 
tion scan cycle due to the pull-up resis- 
tor on the TDI input. This has been done 
to prevent any unwanted interference 
with the proper operation of the system 
logic. 
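
The instruction codes above can be summarized in a small lookup. The following Python sketch is illustrative only: the function names are hypothetical, and the code ranges are transcribed from the 82490XP instruction code table rather than taken from any Intel-supplied software.

```python
# Instruction codes transcribed from the 82490XP table above.
NAMED_CODES = {
    0b0000: "EXTEST",
    0b0001: "SAMPLE/PRELOAD",
    0b0010: "IDCODE",
    0b1111: "BYPASS",
}


def decode_82490xp_instruction(code: int) -> str:
    """Classify a 4-bit 82490XP boundary scan instruction code."""
    if not 0 <= code <= 0b1111:
        raise ValueError("the instruction code is 4 bits wide")
    if code in NAMED_CODES:
        return NAMED_CODES[code]
    if 0b0011 <= code <= 0b1001:
        return "INTEL RESERVED"
    return "PRIVATE"  # codes 1010 through 1110


def safe_for_system_test(code: int) -> bool:
    """Hypothetical guard: reject the codes that system tests must not execute."""
    return decode_82490xp_instruction(code) != "INTEL RESERVED"
```

A test generator could call safe_for_system_test() before scanning a code into the instruction register, so that a reserved code is never loaded by accident.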

9.2.4 TEST ACCESS PORT (TAP) 
CONTROLLER 

The TAP controller is a synchronous, finite state ma- 
chine. It controls the sequence of operations of the 
test logic. The TAP controller changes state only in 
response to the following events: 

1. A rising edge of TCK 

2. Power-up. 



The value of the test mode select (TMS) input signal 
at a rising edge of TCK controls the sequence of the 
state changes. The state diagram for the TAP controller 
is shown in Figure 9-3. Test designers must 
consider the operation of the state machine in order 
to design the correct sequence of values to drive on 
TMS. 



9.2.4.1 Test-Logic-Reset State 

In this state, the test logic is disabled so that normal 
operation of the device can continue unhindered. 
This is achieved by initializing the instruction register 
such that the IDCODE instruction is loaded. No matter 
what the original state of the controller, the 
controller enters Test-Logic-Reset state when the TMS 
input is held high (1) for at least five rising edges of 
TCK. The controller remains in this state while TMS 
is high. The TAP controller is also forced to enter 
this state at power-up. 



9.2.4.2 Run-Test/Idle State 

This is the controller state between scan operations. 
Once in this state, the controller remains in it as 
long as TMS is held low. In devices supporting the 
RUNBIST instruction, the BIST is performed during 
this state and the result is reported in the runbist 
register. For instructions not causing functions to 
execute during this state, no activity occurs in the test 
logic. The instruction register and all test data 
registers retain their previous state. When TMS is high 
and a rising edge is applied to TCK, the controller 
moves to the Select-DR-Scan state. 



9.2.4.3 Select-DR-Scan State 

This is a temporary controller state. The test data 
register selected by the current instruction retains its 
previous state. If TMS is held low and a rising edge 
is applied to TCK when in this state, the controller 
moves into the Capture-DR state, and a scan se- 
quence for the selected test data register is initiated. 
If TMS is held high and a rising edge is applied to 
TCK, the controller moves to the Select-IR-Scan 
state. 

The instruction does not change in this state. 

Figure 9-3. TAP Controller State Diagram 



9.2.4.4 Capture-DR State 

In this state, the boundary scan register captures 
input pin data if the current instruction is EXTEST or 
SAMPLE/PRELOAD. The other test data registers, 
which do not have parallel input, are not changed. 

The instruction does not change in this state. 

When the TAP controller is in this state and a rising 
edge is applied to TCK, the controller enters the 
Exit1-DR state if TMS is high or the Shift-DR state if 
TMS is low. 



9.2.4.5 Shift-DR State 

In this controller state, the test data register con- 
nected between TDI and TDO as a result of the cur- 
rent instruction, shifts data one stage toward its seri- 
al output on each rising edge of TCK. 

The instruction does not change in this state. 

When the TAP controller is in this state and a rising 
edge is applied to TCK, the controller enters the 
Exit1-DR state if TMS is high or remains in the 
Shift-DR state if TMS is low. 



9.2.4.6 Exit1-DR State 

This is a temporary state. While in this state, if TMS 
is held high, a rising edge applied to TCK causes the 
controller to enter the Update-DR state, which 
terminates the scanning process. If TMS is held low and a 
rising edge is applied to TCK, the controller enters 
the Pause-DR state. 

The test data register selected by the current 
instruction retains its previous value during this state. The 
instruction does not change in this state. 



9.2.4.7 Pause-DR State 

The pause state allows the test controller to tempo- 
rarily halt the shifting of data through the test data 
register in the serial path between TDI and TDO. An 
example of using this state could be to allow a tester 
to reload its pin memory from disk during application 
of a long test sequence. 

The test data register selected by the current 
instruction retains its previous value during this state. The 
instruction does not change in this state. 

The controller remains in this state as long as TMS 
is low. When TMS goes high and a rising edge is 
applied to TCK, the controller moves to the Exit2-DR 
state. 



9.2.4.8 Exit2-DR State 

This is a temporary state. While in this state, if TMS 
is held high, a rising edge applied to TCK causes the 
controller to enter the Update-DR state, which termi- 
nates the scanning process. If TMS is held low and a 
rising edge is applied to TCK, the controller enters 
the Shift-DR state. 

The test data register selected by the current instru- 
ction retains its previous value during this state. The 
instruction does not change in this state. 



9.2.4.9 Update-DR State 

The boundary scan register is provided with a 
latched parallel output to prevent changes at the 
parallel output while data is shifted in response to 
the EXTEST and SAMPLE/PRELOAD instructions. 
When the TAP controller is in this state and the 
boundary scan register is selected, data is latched 
onto the parallel output of this register from the shift- 
register path on the falling edge of TCK. The data 
held at the latched parallel output does not change 
other than in this state. 



All shift-register stages in the test data register selected 
by the current instruction retain their previous values 
during this state. The instruction does not change in 
this state. 



9.2.4.10 Select-IR-Scan State 

This is a temporary controller state. The test data 
register selected by the current instruction retains its 
previous state. If TMS is held low and a rising edge 
is applied to TCK when in this state, the controller 
moves into the Capture-IR state, and a scan se- 
quence for the instruction register is initiated. If TMS 
is held high and a rising edge is applied to TCK, the 
controller moves to the Test-Logic-Reset state. 

The instruction does not change in this state. 



9.2.4.11 Capture-IR State 

In this controller state the shift register contained in 
the instruction register loads the fixed value "0001" 
on the rising edge of TCK. 

The test data register selected by the current 
instruction retains its previous value during this state. The 
instruction does not change in this state. 

When the controller is in this state and a rising edge 
is applied to TCK, the controller enters the Exit1-IR 
state if TMS is held high, or the Shift-IR state if TMS 
is held low. 



9.2.4.12 Shift-IR State 

In this state the shift register contained in the in- 
struction register is connected between TDI and 
TDO and shifts data one stage towards its serial out- 
put on each rising edge of TCK. 

The test data register selected by the current 
instruction retains its previous value during this state. The 
instruction does not change in this state. 

When the controller is in this state and a rising edge 
is applied to TCK, the controller enters the Exit1-IR 
state if TMS is held high, or remains in the Shift-IR 
state if TMS is held low. 



9.2.4.13 Exit1-IR State 

This is a temporary state. While in this state, if TMS 
is held high, a rising edge applied to TCK causes the 
controller to enter the Update-IR state, which termi- 
nates the scanning process. If TMS is held low and a 
rising edge is applied to TCK, the controller enters 
the Pause-IR state. 

The test data register selected by the current 
instruction retains its previous value during this state. The 
instruction does not change in this state. 

9.2.4.14 Pause-IR State 

The pause state allows the test controller to tempo- 
rarily halt the shifting of data through the instruction 
register. 

The test data register selected by the current 
instruction retains its previous value during this state. The 
instruction does not change in this state. 

The controller remains in this state as long as TMS 
is low. When TMS goes high and a rising edge is 
applied to TCK, the controller moves to the Exit2-IR 
state. 



9.2.4.15 Exit2-IR State 

This is a temporary state. While in this state, if TMS 
is held high, a rising edge applied to TCK causes the 
controller to enter the Update-IR state, which termi- 
nates the scanning process. If TMS is held low and a 
rising edge is applied to TCK, the controller enters 
the Shift-IR state. 

The test data register selected by the current instru- 
ction retains its previous value during this state. The 
instruction does not change in this state. 



9.2.4.16 Update-IR State 

The instruction shifted into the instruction register is 
latched onto the parallel output from the shift-regis- 
ter path on the falling edge of TCK. Once the new 
instruction has been latched, it becomes the current 
instruction. 

Test data registers selected by the current instruc- 
tion retain the previous value. 
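
The sixteen controller states described above can be modeled as a next-state table indexed by TMS. The Python sketch below is illustrative, not an Intel-supplied model. The transitions out of Test-Logic-Reset (with TMS low) and out of Update-DR and Update-IR are not spelled out in the text; they are assumed here to follow the standard IEEE 1149.1 state diagram, and the helper names are hypothetical.

```python
# state -> (next state when TMS = 0, next state when TMS = 1),
# sampled on a rising edge of TCK.
TAP_NEXT = {
    "Test-Logic-Reset": ("Run-Test/Idle", "Test-Logic-Reset"),
    "Run-Test/Idle": ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-DR-Scan": ("Capture-DR", "Select-IR-Scan"),
    "Capture-DR": ("Shift-DR", "Exit1-DR"),
    "Shift-DR": ("Shift-DR", "Exit1-DR"),
    "Exit1-DR": ("Pause-DR", "Update-DR"),
    "Pause-DR": ("Pause-DR", "Exit2-DR"),
    "Exit2-DR": ("Shift-DR", "Update-DR"),
    "Update-DR": ("Run-Test/Idle", "Select-DR-Scan"),  # assumed per IEEE 1149.1
    "Select-IR-Scan": ("Capture-IR", "Test-Logic-Reset"),
    "Capture-IR": ("Shift-IR", "Exit1-IR"),
    "Shift-IR": ("Shift-IR", "Exit1-IR"),
    "Exit1-IR": ("Pause-IR", "Update-IR"),
    "Pause-IR": ("Pause-IR", "Exit2-IR"),
    "Exit2-IR": ("Shift-IR", "Update-IR"),
    "Update-IR": ("Run-Test/Idle", "Select-DR-Scan"),  # assumed per IEEE 1149.1
}


def step(state: str, tms: int) -> str:
    """Advance the controller by one rising edge of TCK."""
    return TAP_NEXT[state][1 if tms else 0]


def run(state: str, tms_bits) -> str:
    """Apply a sequence of TMS values, one per rising edge of TCK."""
    for tms in tms_bits:
        state = step(state, tms)
    return state
```

With this table, holding TMS high for five rising edges of TCK returns the model to Test-Logic-Reset from any starting state, and the TMS sequence 0, 1, 1, 0, 0 applied from Test-Logic-Reset leaves it in Shift-IR, ready to scan in an instruction.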



9.2.5 BOUNDARY SCAN REGISTER CELL 

The boundary scan register for each component 
contains a cell for each pin, as well as cells for con- 
trol of I/O and tristate pins. 



9.2.5.1 82495XP Boundary Scan Register Cell 

The following is the bit order of the 82495XP bound- 
ary scan register: (from left to right and top to bot- 
tom) 



TDI -> MKEN# KWEND# SWEND# BGT# 
CNA# BRDY# RESERVED CRDY# MWBWT# 
DRCTM# MRO# CWAY# FPFLD# SNPCYC# 
SNPBSY# MHITM# MTHIT# CAHOLD FSIOUT# 
PALLC# SNPADS# CADS# CDTS# CWR# 
CDC# CMIO# RDYSRC MCACHE# KLOCK# 
SMLN# NENE# CFA3 CFA2 TAG11 TAG10 TAG9 
TAG8 TAG7 TAG6 TAG5 TAG4 TAG3 TAG2 TAG1 
TAG0 SET10 SET9 SET8 SET7 CLK SET6 SET5 
SET4 SET3 SET2 SET1 SET0 CFA6 CFA5 CFA4 
CFA1 CFA0 ADS# LEN BLAST# BRDYC1# 
BRDYC2# CACHE# LOCK# BLE# BOFF# KEN# 
AHOLD WR# MIO# DC# PWT PCD HITM# PCYC 
EADS# NA# INV WBWT# WAY WRARR# 
MCYC# BUS# MAWEA# WBWE# WBA WBTYP 
MCFA0 MCFA1 MCFA4 MCFA5 MCFA6 MSET0 
MSET1 MSET2 MSET3 MSET4 MSET5 MSET6 
MSET7 MSET8 MSET9 MSET10 MTAG0 MTAG1 
MTAG2 MTAG3 MTAG4 MTAG5 MTAG6 MTAG7 
MTAG8 MTAG9 MTAG10 MTAG11 MCFA2 MCFA3 
RESET MAOE# MBAOE# SNPCLK SNPSTB# 
EWBE# MPIC# SNPINV FLUSH# SYNC# 
SNPNCA MBALE MALE MACTL OCTL CFA4CTL 
CFA5CTL CACTL FPFLDCTL WBWTCTL 
NACTL -> TDO 

"RESERVED" signals correspond to no connect 
"NC" signals on the 82495XP. 

EWBE# and MPIC# will be implemented in the 
82495XP B-stepping; omit them from the boundary scan 
register for A-stepping 82495XPs. 

All the *CTL cells are control cells that are used to 
select the direction of bidirectional pins or tristate 
output pins. If "1" is loaded into the control 
cell(*CTL), the associated pin(s) are tristated or se- 
lected as input. The following lists the control cells 
and their corresponding pins. 

1. MACTL controls the MSET0-10, MTAG0-11, 
and MCFA0-6 pins. 

2. OCTL controls the WAY, WRARR#, MCYC#, 
MAWEA#, BUS#, WBWE#, WBA, WBTYP, INV, 
EADS#, AHOLD, KEN#, BOFF#, BLE#, 
BRDYC2#, BRDYC1#, BLAST#, NENE#, 
SMLN#, KLOCK#, MCACHE#, RDYSRC, 
CMIO#, CDC#, CWR#, CDTS#, CADS#, 
SNPADS#, PALLC#, FSIOUT#, CAHOLD, 
MTHIT#, MHITM#, SNPBSY#, SNPCYC#, 
CWAY, EWBE#, and MPIC# output pins. 

3. CFA4CTL controls the CFA4 pin. 

4. CFA5CTL controls the CFA5 pin. 

5. CACTL controls the SET0-10, TAG0-11, 
CFA0-3, and CFA6 pins. 

6. FPFLDCTL controls the FPFLD# pin. 

7. WBWTCTL controls the WB/WT# pin. 

8. NACTL controls the NA# pin. 



9.2.5.2 82490XP Boundary Scan Register Cell 

The following is the bit order of the 82490XP bound- 
ary scan register: (from left to right and top to bot- 
tom) 

TDI -> CDCTL WR# BLAST# BRDYC# 
BRDY# HITM# ADS# BE# A0 A1 A2 A3 A4 A5 A6 
A7 A8 A9 A10 A11 A12 A13 A14 A15 MDATA7 
MDATA3 MDATA6 MDATA2 MDATA5 MDATA1 
MDATA4 MDATA0 MDCTL MDOE# MZBT# 
MBRDY# MOEC# MFRZ# MSEL# MCLK MOCLK 
RESET PAR# RESERVED BOFF# WBTYP WBA 
WBWE# BUS# MAWEA# MCYC# CRDY# 
WRARR# WAY CDATA4 CDATA0 CDATA2 
CDATA5 CDATA6 CDATA1 CDATA3 
CDATA7 -> TDO 

"RESERVED" signals correspond to no connect 
"NC" signals on the 82490XP. 

All the *CTL cells are control cells that are used to 
select the direction of bidirectional pins or tristate 
output pins. If "1" is loaded into the control 
cell(*CTL), the associated pin(s) are tristated or se- 
lected as input. The following lists the control cells 
and their corresponding pins. 

1. CDCTL controls the CDATA0-7 pins. 

2. MDCTL controls the MDATA0-7 pins. 
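
To make the role of the control cells concrete, the following Python sketch builds an image of the 82490XP boundary scan register for use with EXTEST. It is a minimal illustration under stated assumptions: the cell names are transcribed from the list above, the helper name extest_vector is hypothetical, and loading "0" into a control cell is assumed to enable its pins as outputs (the text states only that "1" tristates them or selects them as inputs). The order in which the resulting bits must be shifted in on TDI is left to the caller.

```python
# Cells in the order listed above, from the TDI end to the TDO end.
CHAIN_82490XP = (
    "CDCTL WR# BLAST# BRDYC# BRDY# HITM# ADS# BE# "
    "A0 A1 A2 A3 A4 A5 A6 A7 A8 A9 A10 A11 A12 A13 A14 A15 "
    "MDATA7 MDATA3 MDATA6 MDATA2 MDATA5 MDATA1 MDATA4 MDATA0 "
    "MDCTL MDOE# MZBT# MBRDY# MOEC# MFRZ# MSEL# MCLK MOCLK "
    "RESET PAR# RESERVED BOFF# WBTYP WBA WBWE# BUS# MAWEA# MCYC# "
    "CRDY# WRARR# WAY "
    "CDATA4 CDATA0 CDATA2 CDATA5 CDATA6 CDATA1 CDATA3 CDATA7"
).split()

# Control cell -> pins whose direction it selects.
CONTROL_CELLS = {
    "CDCTL": [f"CDATA{i}" for i in range(8)],
    "MDCTL": [f"MDATA{i}" for i in range(8)],
}


def extest_vector(drive):
    """Return one bit per cell, in TDI-to-TDO listing order.

    drive maps pin names to the value (0 or 1) to be driven on that pin.
    Control cells default to 1 (tristated / input) and are cleared only
    for groups that contain a driven pin.
    """
    bits = dict.fromkeys(CHAIN_82490XP, 0)
    for ctl in CONTROL_CELLS:
        bits[ctl] = 1
    for pin, value in drive.items():
        if pin not in bits or pin in CONTROL_CELLS:
            raise KeyError(pin)
        bits[pin] = value & 1
        for ctl, pins in CONTROL_CELLS.items():
            if pin in pins:
                bits[ctl] = 0  # assumed: 0 selects output for the whole group
    return [bits[name] for name in CHAIN_82490XP]
```

Because one control cell governs a whole group of pins, driving any single MDATA pin turns all eight MDATA pins into outputs in this model; a test pattern therefore needs to supply values for the entire group.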



9.2.6 TAP CONTROLLER INITIALIZATION 

The TAP controller is automatically initialized when a 
device is powered up. In addition, the TAP controller 
can be initialized by applying a high signal level on 
the TMS input for five TCK periods. 



9.2.7 BOUNDARY SCAN SIGNAL DESCRIPTION 
AND TIMINGS 



The functionality of TDI, TMS, TDO, and TCK is 
described in Chapter 7. The A.C. timing 
specifications for the boundary scan signals are located in 
Chapter 10. 



9.3 Tri-State Output Test Mode 

The 82495XP has the ability to tri-state all of its 
outputs and bidirectional pins and to disable all pull-ups 
and pull-downs. During tri-state output test mode, all 
pins floated during bus hold, as well as those which 
are never floated during normal operation, are 
tri-stated. When the 82495XP is in tri-state output 
test mode, external testing can be used to test 
board interconnections. 

On the 82495XP, tri-state output test mode is 
invoked by driving HIGHZ# (MBALE) and SLFTST# 
(CRDY#) active to the 82495XP at least 10 clocks 
prior to the deassertion of RESET. Note that 
HIGHZ# has priority over SLFTST#. When both 
HIGHZ# and SLFTST# are driven active, the 
82495XP will invoke the tri-state output mode and 
not invoke BIST. 

Once tri-state output test mode is invoked, the 
82495XP remains in it until the next RESET. 



9.4 82490XP Cache SRAM Testing 

The 82490XP cache SRAM can be tested using 
standard cache memory testing techniques. Code 
must be written to: 

1. Flush and reset the 82495XP/82490XP/CPU 
cache 

2. Write 1's to every bit of a block of memory equal 
to the cache size 

3. Read the block of memory to fill the cache, tagging 
the data as read-only using the MRO# signal 

4. Write 0's to every bit in the block of memory 

5. Read the block; the cache hits should be all 1's 

6. Repeat the process, exchanging 0 for 1 and 1 for 0 


In this example, the code to test the cache must be 
non-cacheable to the 82495XP. Also, the CPU 
cache must be on so that the 82495XP will perform 
line-fills. 
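
The procedure above can be tried out against a toy software model before it is written as target code. The Python sketch below is only an illustration: ToyReadOnlyCache and run_pattern_pass are hypothetical names, and the model assumes that a line filled as read-only is left unchanged by a later write to the same address, which is the behavior the procedure relies on.

```python
class ToyReadOnlyCache:
    """Toy model of a cache in front of memory; every fill is read-only."""

    def __init__(self, size):
        self.memory = [0] * size
        self.lines = {}

    def flush(self):
        self.lines.clear()

    def read(self, addr):
        if addr not in self.lines:  # miss: line fill, tagged read-only
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]

    def write(self, addr, value):
        # Assumed behavior: the write reaches memory, and a read-only
        # line in the cache is not updated.
        self.memory[addr] = value


def run_pattern_pass(cache, size, pattern, mask=0xFF):
    """One pass of steps 1 to 5; returns True when every hit matches."""
    inverse = ~pattern & mask
    cache.flush()                          # step 1
    for addr in range(size):
        cache.write(addr, pattern)         # step 2
    for addr in range(size):
        cache.read(addr)                   # step 3: fill the cache
    for addr in range(size):
        cache.write(addr, inverse)         # step 4
    return all(cache.read(addr) == pattern for addr in range(size))  # step 5
```

Step 6 corresponds to calling run_pattern_pass twice, once with all 1's and once with all 0's. Both polarities are needed because a cell stuck at 0 passes the all-0's pass and fails only the all-1's pass.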



10.0 AC/DC SPECIFICATIONS 



10.1 Background 

The 82495XP has four main interfaces: CPU Bus, 
memory bus controller, memory bus, and 82490XP. 
The memory bus controller is typically implemented 
with PLD devices. The MBC interface signal timings 
are, therefore, generated based on available, off- 
the-shelf PLD specs. The memory bus interface was 
specified to suit a generic memory interface which 
works up to CPU frequency. 



10.2 D.C. Specifications 





Table 10-1. D.C. Specifications 

VCC = 5V ±5%, TCASE = 0 to +85°C 

Symbol  Parameter                 Min     Max         Unit  Notes 
VIL     Input Low Voltage         -0.3    +0.8        V     TTL Level 
VIH     Input High Voltage        2.0     VCC + 0.3   V     TTL Level 
VOL     Output Low Voltage                0.45        V     TTL Level (1) 
VOH     Output High Voltage       2.4                 V     TTL Level (2) 
ICC     Power Supply Current              550         mA    82495XP @ 50 MHz, (3) 
                                          300               82490XP @ 50 MHz 
Power   Power Dissipation                 2.75        W     82495XP @ 50 MHz, (4) 
                                          1.50              82490XP @ 50 MHz 
ILI     Input Leakage Current             ±15         uA    0 ≤ VIN ≤ VCC 
ILO     Output Leakage Current            ±15         uA    0 ≤ VOUT ≤ VCC, Tristate 
IIL     Input Leakage Current             200         uA    VIN = 0.45V, (5) 
CIN     Input Capacitance                 14          pF    for 82495XP 
                                          5                 for 82490XP 
CO      Output Capacitance                18          pF    for 82495XP 
                                          15                for 82490XP 
CI/O    I/O Capacitance                   18          pF    for 82495XP 
                                          15                for 82490XP 
CCLK    CLK Input Capacitance             14          pF    for 82495XP 
                                          5                 for 82490XP 
CTIN    Test Input Capacitance            15          pF    for 82495XP 
                                          10                for 82490XP 
CTOUT   Test Output Capacitance           15          pF    for 82495XP 
                                          10                for 82490XP 
CTCK    Test Clock Capacitance            15          pF    for 82495XP 
                                          10                for 82490XP 



NOTES: 

(1) Parameter measured at 4 mA Iload. 

For MCFA6-MCFA0, MSET10-MSET0, and MTAG11-MTAG0, this parameter is measured at 16 mA Iload. 

(2) Parameter measured at 1 mA Iload. 

For MCFA6-MCFA0, MSET10-MSET0, and MTAG11-MTAG0, this parameter is measured at 2 mA Iload. 

(3) Typical supply current is 400 mA. 

(4) Typical power dissipation is 2W. 

(5) This parameter is for input with pullup. 
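
The power dissipation figures are consistent with the supply current figures multiplied by the nominal 5V supply. The short Python sketch below works through that arithmetic; the function name is hypothetical, and treating the tabulated power as current times nominal VCC is an observation about the numbers, not a statement taken from the specification.

```python
VCC_NOMINAL_V = 5.0  # nominal supply voltage; the tolerance is +/-5%


def power_w(icc_ma, vcc_v=VCC_NOMINAL_V):
    """Power in watts for a supply current given in milliamps."""
    return icc_ma / 1000.0 * vcc_v


max_82495xp_w = power_w(550)      # maximum ICC, 82495XP @ 50 MHz
max_82490xp_w = power_w(300)      # maximum ICC, 82490XP @ 50 MHz
typical_w = power_w(400)          # typical ICC from note (3)
worst_case_w = power_w(550, 5.25) # same current at VCC + 5%
```

A thermal budget that must cover the +5% supply corner would use the last figure rather than the tabulated maximum.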
10.3 A.C. Specifications 

All TTL timing specs are measured at 1.5V for both the "0" and "1" logic levels. 

Table 10-2. Clock, Reset, and Configuration 

VCC = 5V ±5%, TCASE = 0 to +85°C 
Maximum CL = 50 pF unless otherwise specified. 
Minimum CL = 20 pF unless otherwise specified. 

All Inputs and Outputs are TTL Level. 

Symbol  Parameter                                     Min     Max   Unit  Figure  Notes 
t0      CLK, MCLK, MOCLK Frequency                    16.6    50    MHz           1x clock 
t1      CLK, MCLK, MOCLK Stability                            0.1   % 
t2      CLK, MCLK, MOCLK Period                       20      60    ns    10-1 
t3      CLK, MCLK, MOCLK High Time                    7             ns    10-1    (1) 
t4      CLK, MCLK, MOCLK Low Time                     7             ns    10-1    (1) 
t5      CLK, MCLK, MOCLK Rise Time                            2     ns            (1) 
t6      CLK, MCLK, MOCLK Fall Time                            2     ns            (1) 
t7      RESET Setup Time                              7             ns    10-4 
t8      RESET Hold Time                               2             ns    10-4 
t9      RESET Duration                                8xt2          ns    10-4    for 82495XP, (2) 
                                                      15xt2                       for 82490XP 
t10     All Configurations CFG3-CFG0,                 10xt2         ns    10-4    (3), (4) 
        CPUTYP, SNPMD, PLOCKEN, 
        MEMLDRV, 82490XPLDRV, HIGHZ#, 
        SLFTST# Setup Time 
t11     All Configurations CFG3-CFG0,                               ns    10-4    (3), (5) 
        CPUTYP, SNPMD, PLOCKEN, 
        MEMLDRV, 82490XPLDRV, HIGHZ#, 
        SLFTST# Hold Time 
t12     FLUSH#, SYNC# Setup Time                      8             ns    10-3    for 82495XP, (6) 
t13     FLUSH#, SYNC# Hold Time                       1             ns    10-3    for 82495XP, (7) 
t14     FLUSH#, SYNC# Duration                        2xt2          ns            (8) 
t15     MOCLK falling edge to MCLK rising edge        2             ns 
t16     FERR#, HLDA Valid Delay                       2       15    ns    10-2 
t17     FERR#, HLDA Float Delay                               18    ns 
t18     HOLD, BOFF# Setup Time                        7             ns    10-3 
t19     HOLD, BOFF# Hold Time                         2             ns    10-3 






NOTE: 

(1) Rise/Fall, High/Low times measured between 0.8V and 2.0V. 

(2) Power up reset duration should be 1 ms after Vcc and CLK are stable. If configuration inputs with pullups are left floated, 
10 us RESET duration is required. 

(3) Timing is referenced to reset falling edge. 

(4) 8ns setup time is required to guarantee recognition on next clock. 

(5) 1 ns hold time is required to guarantee recognition on next clock. 

(6) To guarantee recognition on next clock. 

(7) Synchronous mode only. 

(8) Asynchronous mode only. To guarantee recognition. 
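
Several entries in the clock, reset, and configuration table are expressed as multiples of the clock period t2. The Python sketch below converts them to nanoseconds for a given clock frequency. The function names are hypothetical, and the t10 entry is read here as 10 x t2.

```python
RESET_MULTIPLE = {"82495XP": 8, "82490XP": 15}  # t9, in units of t2
CONFIG_SETUP_MULTIPLE = 10                      # t10, read as 10 x t2


def clk_period_ns(freq_mhz):
    """Clock period t2 in nanoseconds for a frequency in MHz."""
    return 1000.0 / freq_mhz


def reset_min_ns(freq_mhz, part):
    """Minimum RESET duration t9 in nanoseconds for the named part."""
    return RESET_MULTIPLE[part] * clk_period_ns(freq_mhz)


def config_setup_min_ns(freq_mhz):
    """Minimum configuration input setup time t10 in nanoseconds."""
    return CONFIG_SETUP_MULTIPLE * clk_period_ns(freq_mhz)
```

At 50 MHz the period is 20 ns, so these helpers give 160 ns and 300 ns of RESET for the 82495XP and 82490XP respectively, and 200 ns of configuration setup. Note (2) still applies on top of these minimums for power-up reset.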
Table 10-3. Memory Bus Controller 82495XP/82490XP Interface 

VCC = 5V ±5%, TCASE = 0 to +85°C 
Maximum CL = 50 pF unless otherwise specified. 
Minimum CL = 20 pF unless otherwise specified. 

All Inputs and Outputs are TTL Level. 

Symbol  Parameter                                     Min     Max   Unit  Figure  Notes 
t30     BRDY#, CRDY#, KWEND#, SWEND#,                 8             ns    10-3    82495XP Only 
        BGT#, CNA#, [WRMRST] Setup Time 
t30a    BRDY#, CRDY# Setup Time                       7             ns    10-3    82490XP Only 
t31     BRDY#, CRDY#, KWEND#, SWEND#,                 1             ns    10-3    82495XP Only 
        BGT#, CNA#, [WRMRST] Hold Time 
t32     CW/R#, CD/C#, CMI/O#, RDYSRC,                 2       12    ns    10-2 
        MCACHE#, KLOCK#, BLE#, PALLC#, 
        CAHOLD, CWAY, FSIOUT#, CADS#, 
        CDTS#, SNPADS# Valid Delay 
t33     NENE#, SMLN# Valid Delay                      2       15    ns    10-2 
t34     MDATA Setup to CLK (clock before              6             ns    10-3 
        BRDY# active) 
t35     MDATA Valid Delay from CLK (CLK from          3       15    ns    10-2 
        CDTS# valid, MDOE# active) 
t36     MDATA Valid Delay from MDOE# active                   10    ns    10-2 
t37     MDATA Float Delay from MDOE# inactive                 14    ns 







Table 10-4. 82495XP Memory Interface 

VCC = 5V ±5%, TCASE = 0 to +85°C 
Maximum CL = 50 pF unless otherwise specified. 
Minimum CL = 20 pF unless otherwise specified. 

All Inputs and Outputs are TTL Level. 

Symbol  Parameter                                     Min     Max   Unit  Figure  Notes 
t50     SNPCLK Frequency                                      50    MHz           1x clock (10) 
t51     SNPCLK Period                                 20            ns    10-1    (11) 
t52     SNPCLK High Time                              8             ns    10-1 
t53     SNPCLK Low Time                               8             ns    10-1 
t54     SNPCLK Rise Time                                      2     ns            (1) 
t55     SNPCLK Fall Time                                      2     ns            (1) 
t56     MCFA6-MCFA0, MSET10-MSET0,                    2       13    ns    10-5    (2), (3) 
        MTAG11-MTAG0 Valid Delay 
t57     MCFA6-MCFA0, MSET10-MSET0,                    2       15    ns    10-5    (4) 
        MTAG11-MTAG0 Float Delay 
t58     MCFA6-MCFA0, MSET10-MSET0,                    2       15    ns    10-5    (5) 
        MTAG11-MTAG0 Valid Delay 
t60     MCFA6-MCFA0, MSET10-MSET0,                    2       15    ns    10-2    (6), (12) 
        MTAG11-MTAG0 Valid Delay 
t62a    MCFA6-MCFA0, MSET10-MSET0,                    8             ns    10-3    (7a) 
        MTAG11-MTAG0, SNPINV, SNPNCA, MAOE#, 
        MBAOE#, SNPSTB# Setup Time 
t62b    MCFA6-MCFA0, MSET10-MSET0,                    1             ns    10-3    (7b) 
        MTAG11-MTAG0, SNPINV, SNPNCA, MAOE#, 
        MBAOE# Setup Time 
t62c    MCFA6-MCFA0, MSET10-MSET0,                    8             ns    10-3    (7c) 
        MTAG11-MTAG0, SNPINV, SNPNCA, MAOE#, 
        MBAOE#, SNPSTB# Setup Time 
t63a    MCFA6-MCFA0, MSET10-MSET0,                    1             ns    10-3    (7a) 
        MTAG11-MTAG0, SNPINV, SNPNCA, MAOE#, 
        MBAOE#, SNPSTB# Hold Time 
t63b    MCFA6-MCFA0, MSET10-MSET0,                    8             ns    10-3    (7b) 
        MTAG11-MTAG0, SNPINV, SNPNCA, MAOE#, 
        MBAOE# Hold Time 
t63c    MCFA6-MCFA0, MSET10-MSET0,                    1             ns    10-3    (7c) 
        MTAG11-MTAG0, SNPINV, SNPNCA, MAOE#, 
        MBAOE#, SNPSTB# Hold Time 
t64     SNPSTB# Setup Time                            8             ns    10-3    (8) 
t65     SNPSTB# Hold Time                             1             ns    10-3    (8) 
t66     SNPSTB# Active/Inactive Time                  8             ns    10-3    (9) 
t67     MRO#, MKEN#, DRCTM#, MWB/WT# Setup Time       8             ns    10-3 
t68     MRO#, MKEN#, DRCTM#, MWB/WT# Hold Time        1             ns    10-3 
t69     MTHIT#, MHITM#, SNPBSY#, SNPCYC#              2       13    ns    10-2 
        Valid Delay 
t69a    SNPCYC# Valid Delay                           2       12    ns    10-2 






NOTES: 

(1) Rise/fall times measured between 0.45V and 2.4V 

(2) See capacitive derating curves for loads above the 50pF specification 

(3) Valid delay from MAOE#, MBAOE# going active (low) 

(4) Float delay from MAOE#, MBAOE# going inactive (high) 

(5) Valid delay from MALE or MBALE if both MAOE#, MBAOE# are active 

(6) Valid delay from CLK only if MALE or MBALE, MAOE# and MBAOE# are active 

(7) a. In clocked mode referenced to SNPCLK rising edge 

b. In strobed mode referenced to SNPSTB# falling edge 

c. In synchronous mode, refer to CLK 

(8) Asynchronous clocked mode only. Timings referenced to SNPCLK 

(9) Asynchronous signal. Time to guarantee recognition on next clock 

(10) SNPCLK is only used for the clocked memory bus mode 

(11) t51 > t2 

(12) This parameter is valid either from SNPCLK or CLK 

Table 10-5. 82490XP Clocked Mode 

VCC = 5V ±5%, TCASE = 0 to +85°C 
Maximum CL = 50 pF unless otherwise specified. 
Minimum CL = 20 pF unless otherwise specified. 

All Inputs and Outputs are TTL Level. 

Symbol  Parameter                                     Min     Max   Unit  Figure  Notes 
t38     MBRDY#, MSEL#, MEOC# Setup to MCLK            5             ns    10-3 
t39     MBRDY#, MSEL#, MEOC# Hold from MCLK           2             ns    10-3 
t40     MZBT#, MFRZ# Setup to MCLK                    5             ns    10-3 
t41     MZBT#, MFRZ# Hold from MCLK                   2             ns    10-3 
t42     MDATA Setup to MCLK                           5             ns    10-3 
t43     MDATA Hold from MCLK                          3             ns    10-3 
t44     MDATA Valid Delay from MCLK*MBRDY#            2       16    ns    10-2 
t45     MDATA Valid Delay from MCLK*MEOC#,            2       20    ns    10-2 
        MCLK*MSEL# 
t46     MDATA Valid Delay from MOCLK                  2       12    ns    10-2 





Table 10-6. 82490XP Strobed Mode 

VCC = 5V ±5%, TCASE = 0 to +85°C 
Maximum CL = 50 pF unless otherwise specified. 
Minimum CL = 20 pF unless otherwise specified. 

All Inputs and Outputs are TTL Level. 

Symbol  Parameter                                     Min     Max   Unit  Figure  Notes 
t85     MISTB, MOSTB High Time                        12            ns    10-6 
t86     MISTB, MOSTB Low Time                         12            ns    10-6 
t87     MEOC# High Time                               8             ns    10-6 
t88     MEOC# Low Time                                8             ns    10-6 
t89     MxSTB, MEOC# Rise Time                                2     ns            (1) 
t90     MxSTB, MEOC# Fall Time                                2     ns            (1) 
t91     MSEL# High Time for restart                   8             ns    10-6 
t92     MSEL# Setup before transition on MxSTB        5             ns    10-8 
t93     MSEL# Hold after transition on MxSTB          10            ns    10-8 
t94     MSEL# Hold after transition on MEOC#          2             ns    10-8 
t95     MxSTB transition to/from MEOC# falling        10            ns 
        transition 
t96     MZBT# Setup to MSEL# or MEOC# falling edge    5             ns    10-7 
t97     MZBT# Hold from MSEL# or MEOC# falling edge   2             ns    10-7 
t98     MFRZ# Setup to MEOC# falling edge             5             ns    10-7 
t99     MFRZ# Hold from MEOC# falling edge            2             ns    10-7 
t100    MDATA Setup to MxSTB or MEOC# falling         5             ns    10-7 
        transition 
t101    MDATA Hold from MxSTB or MEOC# falling        2             ns    10-7 
        transition 
t102    MDATA Valid Delay from MxSTB transition       2       16    ns    10-9 
t103    MDATA Valid Delay from MEOC# falling          2       20    ns    10-9 
        transition or MSEL# deactivation 





NOTE: 

(1) Rise/Fall times are measured between 0.8V and 2.0V 

Table 10-7. Test Mode 

VCC = 5V ±5%, TCASE = 0 to +85°C 
Maximum CL = 50 pF unless otherwise specified. 
Minimum CL = 20 pF unless otherwise specified. 

All Inputs and Outputs are TTL Level. 

Symbol  Parameter                                     Min     Max   Unit  Figure  Notes 
t120    TCK Frequency                                         25    MHz           1x clock 
t121    TCK Period                                    40            ns            (2) 
t122    TCK High Time                                 10            ns            @ 2.0V 
t123    TCK Low Time                                  10            ns            @ 0.8V 
t124    TCK Rise Time                                         4     ns            (1) 
t125    TCK Fall Time                                         4     ns            (1) 
t126    TDI, TMS Setup Time                           8             ns    10-10 
t127    TDI, TMS Hold Time                            7             ns    10-10 
t128    TDO Valid Delay                               3       25    ns    10-10 
t129    TDO Float Delay 
t130    All Outputs Valid Delay                       3       25    ns    10-10   (3) 
t131    All Outputs Float Delay                               36    ns    10-10   (3) 



NOTES: 

(1) Rise/Fall times are measured between 0.8V and 2.0V. Rise/Fall times can be relaxed by 1 ns per 10 ns increase in TCK 
period. 

(2) TCK period ≥ CLK period 

(3) Parameter measured from TCK 
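
Notes (1) and (2) can be turned into a quick check of a proposed TCK setup. The Python sketch below is illustrative only. The helper names are hypothetical, note (2) is read as "TCK period is at least the CLK period", and the relaxation in note (1) is applied in whole 10 ns steps, which is the conservative reading of "1 ns per 10 ns increase".

```python
MIN_TCK_PERIOD_NS = 40  # t121 minimum
BASE_EDGE_NS = 4        # t124/t125 maximum at the minimum TCK period


def max_tck_edge_ns(tck_period_ns):
    """Maximum allowed TCK rise or fall time for a given TCK period."""
    if tck_period_ns < MIN_TCK_PERIOD_NS:
        raise ValueError("TCK period is below the t121 minimum")
    # Assumed: one extra nanosecond per complete 10 ns of added period.
    return BASE_EDGE_NS + (tck_period_ns - MIN_TCK_PERIOD_NS) // 10


def tck_period_ok(tck_period_ns, clk_period_ns):
    """Check the t121 minimum and the relation to the CLK period."""
    return tck_period_ns >= MIN_TCK_PERIOD_NS and tck_period_ns >= clk_period_ns
```

For example, a tester running TCK with a 60 ns period would be allowed 6 ns edges under this reading, while a 55 ns period earns only one extra nanosecond.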



Figure 10-1. Clock Waveform (CLK, SNPCLK; timing labels t2, t51, t71; t3, t52, t72; t4, t53, t73) 

Figure 10-2. Valid Delay Timings (tx = t16, t32, t33, t35, t36, t44, t45, t60, t69) 

Figure 10-3. Setup and Hold Timings 

Figure 10-3a. Setup and Hold Timings in Strobed Snooping Mode 

Figure 10-4. Reset and Configuration Timings 

Figure 10-5. Memory Interface Signals 

Figure 10-6. Active/Inactive Timing 

Figure 10-7. Setup and Hold Timing 

Figure 10-8. Setup and Hold Timing 

Figure 10-9. Valid Delay Timing 

Figure 10-10. Test Timings 



Introduction 

The i860™ 64-bit microprocessor is a general-purpose 
CPU with on-chip integer unit, floating point, memory 
management, caches, and graphics. The i860 micro- 
processor supports 3-D graphics software with the fol- 
lowing functions: 

1. Hidden surface elimination 

2. Distance interpolation 

3. Intensity interpolation for 3-D shading 

The fzchks (Z-buffer Check) and pst (Pixel Store) in- 
structions expedite hidden surface elimination. Dis- 
tance interpolation is accomplished with faddz (Add 
with Z merge), and intensity interpolation occurs with 
faddp (Add with Pixel Merge). The purpose of this ap- 
plication note is to illustrate the intended use of these 
instructions in a manner independent of any graphics 
environment in which the instructions might be used. It 
is not the purpose of this application note to present the 
most efficient instruction sequences. While the inner 
loop of Example 7 has as few instructions as logically 
possible, the other examples are intended to present 
general concepts, not optimum implementations. Tun- 
ing for maximum performance depends on the specific 
environment. 

This application note assumes familiarity with the 
i860™ 64-bit Microprocessor Programmer's Reference 
Manual (Intel order number 240329); the i860 micro- 
processor instructions for graphics are detailed in sec- 
tion 6.6. 



1.0 3-D RENDERING 

This series of examples are routines that might be used 
at the lowest level of a graphics software system to con- 
vert a machine-independent description of a 3-D image 
into values for the frame buffer of a color video display. 
Typically, higher-level graphics routines represent an 
object as a set of polygons that together roughly de- 
scribe the surfaces of the objects to be displayed. The 
graphics system maintains a database that describes 



these polygons in terms of their colors, properties of 
reflectance or translucence, and the locations in 3-D 
space of their vertices. Due to the roughness of the 
representation, the amount of information in the data- 
base is considerably less than that which must be deliv- 
ered to the video display. A rendering procedure, such 
as Example 7, uses interpolation to derive the detailed 
information needed for each pixel in the graphics frame 
buffer. The rendering procedure also performs pixel-by- 
pixel hidden-surface elimination. 

The focus of this series of examples is Example 7, 
which operates on a segment of a scan line. The seg- 
ment is bounded by two points of given location and 
color: from point (X1, Y0, Z1) with color intensities
Red1, Grn1, Blu1 to point (X2, Y0, Z2) with color
intensities Red2, Grn2, Blu2. The points and color
intensities are determined by higher-level graphics software.
The points represent the intersection of the scan line 
with two edges of the projected image of a polygon. For 
a given scan line, the rendering procedure is executed 
once for each polygon that projects onto that scan line. 
The higher-level graphics software is responsible for 
orienting the objects with respect to the viewer, for 
making perspective calculations, for scaling, and for de- 
termining the amount of light that falls on each poly- 
gon vertex. 

The 16-bit pixel format is used, giving ample resolution
for color shading: 2^6 intensity values for red, 2^6
intensity values for green, and 2^4 intensity values for blue.
Example 1 shows how to set the pixel size. For hidden-surface
elimination, the Z-buffer (or depth buffer) technique
is employed, each Z value having a resolution of
16 bits.
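The 6/6/4 split described above can be sketched in a few lines of Python. This is only an illustration: the field order (red in bits 15..10, green in bits 9..4, blue in bits 3..0) is an assumption, and the names pack_pixel16 and unpack_pixel16 are hypothetical helpers, not part of any Intel software.

```python
def pack_pixel16(red, green, blue):
    """Pack 6-bit red, 6-bit green and 4-bit blue into one 16-bit pixel.

    Assumed layout: red in bits 15..10, green in bits 9..4, blue in bits 3..0.
    """
    if not (0 <= red < 64 and 0 <= green < 64 and 0 <= blue < 16):
        raise ValueError("intensity out of range")
    return (red << 10) | (green << 4) | blue


def unpack_pixel16(pixel):
    """Split a 16-bit pixel back into its (red, green, blue) intensities."""
    return (pixel >> 10) & 0x3F, (pixel >> 4) & 0x3F, pixel & 0xF
```

With this layout, full intensity in all three colors fills the 16-bit word exactly, which is one way to confirm that 6 + 6 + 4 bits leave no unused bits.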

Because the examples presented here use almost all of 
the registers of the i860 microprocessor, the registers 
are given symbolic names, as defined by Example 2. In 
a real application, it is likely that some of the inputs to 
the rendering procedure would be passed in floating- 
point registers instead of the integer registers employed 
here. The register allocation shown in Example 2 sim- 
plifies the examples by avoiding the need to use any 
register for multiple purposes. 




// SET PIXEL SIZE TO 16
        ld.c     psr, Ra            // Work on psr
        andnoth  0x00C0, Ra, Ra     // Clear PS
        orh      0x0040, Ra, Ra     // PS = 16-bit pixels
        st.c     Ra, psr            //



Example 1. Setting Pixel Size 
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// 


// REGISTER DEFINITIONS FOR RENDERING PROCEDURE
// INTEGER LOCALS
        Ra      = r4    // Temporary
        Rb      = r5    // Temporary
        Rc      = r6    // Temporary
        Rd      = r7    // Temporary
// INTEGER INPUTS
        X1      = r16   // X coordinate of starting point of line segment in pixels
        dX      = r17   // Width of scan line segment in number of pixels
        ZBP     = r18   // Z-buffer pointer to the current line segment
        Z1      = r19   // Initial Z value, fixed-point 16.16 format
        mZ      = r20   // Z slope, fixed-point 16.16 format
        FBP     = r21   // Graphics frame buffer pointer to the current line segment
        Red1    = r22   // Initial red intensity, fixed-point 6.10 format, plus .5
        Grn1    = r23   // Initial green intensity, fixed-point 6.10 format, plus .5
        Blu1    = r24   // Initial blue intensity, fixed-point 6.10 format, plus .5
        mR      = r25   // Red slope, fixed-point 6.10 format
        mG      = r26   // Green slope, fixed-point 6.10 format
        mB      = r27   // Blue slope, fixed-point 6.10 format
// REAL LOCALS
        aZ      = f2    // Accumulated Z values
        aZh     = f3    //
        iZ1     = f4    // Z interpolant, coefficient 1.0
        iZ1h    = f5    //
        iZ3     = f6    // Z interpolant, coefficient 3.0
        iZ3h    = f7    //
        oldz    = f8    // Original values from the Z-buffer
        newz    = f10   // New Z-buffer values
        newzh   = f11   //
        newi    = f12   // New pixel values
        iR      = f14   // Red interpolant, coefficient 4.0
        iRh     = f15   //
        aR      = f16   // Accumulated red intensities
        aRh     = f17   //
        iG      = f18   // Green interpolant, coefficient 4.0
        iGh     = f19   //
        aG      = f20   // Accumulated green intensities
        aGh     = f21   //
        iB      = f22   // Blue interpolant, coefficient 4.0
        iBh     = f23   //
        aB      = f24   // Accumulated blue intensities
        aBh     = f25   //
        lZmask  = f26   // left-end Z mask
        lZmaskh = f27   //
        rZmask  = f28   // right-end Z mask
        rZmaskh = f29   //



Example 2. Register Assignments 
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2.0 DISTANCE INTERPOLATION 

To perform hidden surface elimination at each pixel,
the rendering routine first interpolates the value of Z at
each pixel. Distance interpolation consists of calculating
the slope of Z over the given line segment, then
increasing the Z value of each successive pixel by that
amount, starting from X1. The width of the line segment
in pixels is . . .

dX = X2 - X1

Calculate the reciprocal of dX:

RdX = 1/dX

The value of dX is used several times as a divisor. It is
most efficient to calculate its reciprocal once, then, instead
of dividing by dX, multiply by RdX. The slope of
Z is . . .

mZ = (Z2 - Z1)*RdX

Because each polygon is a plane, the value of mZ is 
constant for all scan lines that intersect the polygon; 
therefore mZ needs to be calculated only once for each 



polygon. Example 7 assumes that dX and mZ have already
been calculated, and all that remains is to apply
mZ to successive pixels. Let Z(Xn) be the Z value at
pixel Xn. Then . . .

Z(X1) = Z1

Z(X1 + 1) = Z1 + mZ

Z(X1 + 2) = Z1 + 2*mZ

Z(X1 + N) = Z1 + N*mZ

Z(X1 + dX) = Z1 + dX*mZ = Z(X2)

Figure 1 illustrates this Z-value interpolation. 
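The recurrence Z(X1 + N) = Z1 + N*mZ can be sketched in Python using the 16.16 fixed-point format that Example 2 assigns to Z1 and mZ. This is a scalar illustration only, not the faddz-based method of Example 7; the helper names and the endpoint values (2400 and 3000) are chosen for illustration.

```python
FRAC_BITS = 16  # 16.16 fixed point: 16 integer bits, 16 fraction bits


def to_fixed(value):
    """Convert a number to 16.16 fixed point."""
    return int(round(value * (1 << FRAC_BITS)))


def interpolate_z(z1, z2, dx):
    """Return the integer Z values for pixels X1 .. X1 + dX."""
    rdx = 1.0 / dx                      # reciprocal, computed once
    m_z = to_fixed((z2 - z1) * rdx)     # slope of Z in 16.16 format
    acc = to_fixed(z1)                  # accumulator starts at Z1
    values = []
    for _ in range(dx + 1):
        values.append(acc >> FRAC_BITS)  # keep the 16-bit integer part
        acc += m_z                       # one addition per pixel
    return values


print(interpolate_z(2400, 3000, 6))
```

Only one addition per pixel is needed once mZ is known, which is why the slope is computed once and reused.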

The faddz instruction helps to perform the above calcu- 
lations 64 bits at a time. Because a Z value is 16 bits 
wide, Example 7 operates on the Z buffer in groups of 
four. The faddz instruction, however, treats the interpolation
values (N*mZ) as 32-bit fixed-point numbers;
therefore, two faddz instructions are executed for each 
group of four pixels. Because of the way the faddz shifts 




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Figure 1. Z-Buffer Interpolation 
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the MERGE register, the first faddz corresponds to 
even-numbered pixels, while the second corresponds to 
odd-numbered pixels. Instead of starting with the value 
for the first pixel (Z(X1)) and adding mZ to each pixel 
to produce the value for the next pixel, the example 
procedure starts with the values for the first two even-
numbered pixels and adds 1*mZ to each of these values
to produce the values for the adjacent odd-numbered 
pair. Adding 3*mZ to each of the Z values of an odd- 
numbered pair produces the values for the next even- 



numbered pair. Figure 2 shows one way of constructing 
the operands before starting the distance interpolations. 
(The initial value given to src1 depends on the alignment
of the first pixel.) Table 1 helps to visualize the
process.

After two faddz instructions, the MERGE register 
holds the Z values for four adjacent pixels (in the cor- 
rect order). The form instruction copies MERGE into 
one of the 64-bit floating-point registers. 
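The even/odd scheme can be modeled in Python. This is a simplified sketch and not a definitive description of the hardware. It assumes that faddz adds the two 32-bit halves independently (any carry between halves is ignored here), then shifts MERGE right by 16 bits and loads the upper 16 bits of each half into MERGE bits 63..48 and 31..16. It also assumes the first pixel is aligned on an 8-byte boundary. All function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def pack_halves(high, low):
    """Place two 32-bit values in the high and low halves of a 64-bit word."""
    return ((high & MASK32) << 32) | (low & MASK32)


def faddz(acc, interp, merge):
    """Simplified faddz model: add per half, then update MERGE (see lead-in)."""
    high = ((acc >> 32) + (interp >> 32)) & MASK32
    low = ((acc & MASK32) + (interp & MASK32)) & MASK32
    merge = (merge >> 16) & 0x0000FFFF0000FFFF        # keep the shifted fields
    merge |= ((high >> 16) << 48) | ((low >> 16) << 16)  # load the new fields
    return pack_halves(high, low), merge


def z_groups(z1, m_z, groups):
    """Return 4-pixel groups of integer Z values; z1 and m_z are 16.16 values."""
    acc = pack_halves(z1 - 1 * m_z, z1 - 3 * m_z)   # aligned start values
    i_z3 = pack_halves(3 * m_z, 3 * m_z)            # interpolant, coefficient 3
    i_z1 = pack_halves(1 * m_z, 1 * m_z)            # interpolant, coefficient 1
    out = []
    for _ in range(groups):
        merge = 0                                   # form empties MERGE
        acc, merge = faddz(acc, i_z3, merge)        # even-numbered pixels
        acc, merge = faddz(acc, i_z1, merge)        # odd-numbered pixels
        out.append([(merge >> (16 * i)) & 0xFFFF for i in range(4)])
    return out


print(z_groups(2400 << 16, 100 << 16, 2))
```

In this model the accumulator coefficients run -1/-3, 2/0, 3/1, 6/4, 7/5, and so on, so each pair of additions leaves four consecutive Z values in MERGE in pixel order.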
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Figure 2. faddz Operands 
Table 1. faddz Visualization 



Operands      63-32    31-0     MERGE:  63-48   47-32   31-16   15-0

src1           -1.0    -3.0
src2            3.0     3.0
rdest/src1      2.0     0.0               2               0
src2            1.0     1.0
rdest/src1      3.0     1.0               3       2       1       0
src2            3.0     3.0
rdest/src1      6.0     4.0               6               4
src2            1.0     1.0
rdest/src1      7.0     5.0               7       6       5       4
src2            3.0     3.0
rdest/src1     10.0     8.0              10               8
src2            1.0     1.0
rdest/src1     11.0     9.0              11      10       9       8
src2            3.0     3.0
rdest/src1     14.0    12.0              14              12
src2            1.0     1.0
rdest          15.0    13.0              15      14      13      12
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Because the values of Z1 and mZ are constant for each loop through the rendering routine, the numbers shown here are 
the values of the coefficient N, where the actual operands have the values Z1 + N*mZ. For each execution of faddz, src1
is the same as rdest of the prior faddz. After every two faddz instructions, a form instruction empties the MERGE register.
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// CONSTRUCT 


// INTERPOLANTS iZ1 AND iZ3 GIVEN mZ
        ixfr     mZ, iZ1            // Join each half in 64-bit register
        shl      1, mZ, Ra          // Ra = 2*mZ
        adds     Ra, mZ, Ra         // Ra = 3*mZ
        ixfr     Ra, iZ3            // Join each half in 64-bit register
        fmov.ss  iZ1, iZ1h          // Join each half in 64-bit register
        fmov.ss  iZ3, iZ3h          // Join each half in 64-bit register



Example 3. Construction of Z Interpolants 
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Figure 3. Pixel Interpolation for Gouraud Shading 



The same register is used as both src1 and rdest in all
faddz instructions. This register serves to accumulate Z
values for successive pixels; therefore, it is called an
accumulator. The registers used as src2 are called interpolants.
The code in Example 3 constructs the interpolants;
it needs to be executed only once for each polygon.



3.0 COLOR INTERPOLATION 

To determine the RGB color intensities at each pixel, 
the rendering routine interpolates between the color in- 
tensities at the end points. (This rendering technique is 
called "Gouraud shading" after H. Gouraud, "Continuous
Shading of Curved Surfaces," IEEE Transactions
on Computers, C-20(6), June 1971, pp. 623-628.) Let
the symbol C (color) represent either R (red), G
(green), or B (blue). Color interpolation consists of calculating
the slope of C over the given line segment, then
increasing the C values of each successive pixel by that
amount, starting from the values for X1. This must be
done for C = R, C = G, and C = B. The slope of C is . . .

mC = (C2 - C1)*RdX

. . . where RdX = 1/dX



The value of mC is constant for all scan lines that inter- 
sect a given pair of polygon edges; therefore mC needs 
to be calculated only once for each such pair. Example 
7 assumes that mC has already been calculated for all 
colors, and all that remains is to apply mC to successive 
pixels. Let C(Xn) be a C value at pixel Xn. Then . . . 

C(X1) = C1

C(X1 + 1) = C1 + mC

C(X1 + 2) = C1 + 2*mC

C(X1 + N) = C1 + N*mC

C(X1 + dX) = C1 + dX*mC = C(X2)

Figure 3 illustrates Gouraud shading of a triangle. 

The faddp instruction performs the above calculations 
64 bits at a time. Because a pixel is 16 bits wide, Exam- 
ple 7 operates on pixels in groups of four. Instead of 
starting with the value for the first pixel (C(X1)) and 
adding mC to each pixel to produce the value for the 
next pixel, the example procedure starts with the values 
for the first four pixels and adds 4*mC to each group of
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four to produce the values for the next four. Three 
faddp instructions are executed for each group of four 
pixels. The first increments the blue values; the second, 
green; the third, red. Figure 4 shows one way of con- 
structing the operands for each color before starting the 
color interpolations. (The initial value given to src1 depends
on the alignment of the first pixel.)

Setup of the accumulator and interpolants is similar to 
that of the Z-buffer. The code in Example 4 constructs 
the interpolants; it needs to be executed only once for 
each pair of edges in each polygon. 
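The blue, green, red sequence of faddp instructions can be modeled in Python. This is a simplified sketch under stated assumptions, not a definitive description of the hardware. It assumes that for 16-bit pixels faddp shifts MERGE right by 6 bits and loads the top 6 bits of each 16-bit lane of the sum. It adds the four lanes independently (carries between lanes are ignored), and it assumes the first pixel is aligned on an 8-byte boundary. All names are hypothetical.

```python
LANE = 0xFFFF
FIELDS = 0xFC00FC00FC00FC00      # bits 15..10 of each 16-bit lane
MASK64 = (1 << 64) - 1


def lanes(values):
    """Pack four values (truncated to 16 bits), values[0] in bits 15..0."""
    word = 0
    for i, value in enumerate(values):
        word |= (value & LANE) << (16 * i)
    return word


def color_interpolant(m_c):
    """4*mC in 6.10 format, copied into all four lanes (compare Example 4)."""
    return lanes([4 * m_c] * 4)


def initial_accumulator(c1, m_c):
    """Aligned start values: lane i holds C1 + (i - 4)*mC."""
    return lanes([c1 + (i - 4) * m_c for i in range(4)])


def faddp16(acc, interp, merge):
    """Simplified faddp model for 16-bit pixels (see lead-in)."""
    total = lanes([(acc >> (16 * i)) + (interp >> (16 * i)) for i in range(4)])
    merge = ((merge >> 6) & ~FIELDS & MASK64) | (total & FIELDS)
    return total, merge


def shade_four(acc_b, acc_g, acc_r, i_b, i_g, i_r):
    """Run faddp for blue, green, red; return new accumulators and 4 pixels."""
    merge = 0
    acc_b, merge = faddp16(acc_b, i_b, merge)
    acc_g, merge = faddp16(acc_g, i_g, merge)
    acc_r, merge = faddp16(acc_r, i_r, merge)
    pixels = [(merge >> (16 * i)) & LANE for i in range(4)]
    return acc_b, acc_g, acc_r, pixels
```

Because blue is shifted twice after it is loaded, only its upper four bits survive in the pixel, which matches the 6/6/4 format.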



4.0 BOUNDARY CONDITIONS 

The i860 microprocessor operates on 64-bit quantities 
that are aligned on 8-byte boundaries. The code in this 
example takes full advantage of this design, handling 
four 16-bit pixels in each loop. However, if the first or 



last pixel of a line segment is not on an 8-byte bounda- 
ry, two kinds of special considerations are required: 

1. Masking of Z values near the end points. 

2. Initialization of the accumulators. 

4.1 Z-Buffer Masking 

When either the first or last pixel of the line segment is 
not at an 8-byte boundary, the rendering procedure 
must mask the first or last set of new Z-buffer values 
(newz) so that the Z-buffer and the frame buffer are not 
erroneously updated. Sometimes both the first and last 
pixels are in the same 4-pixel set, in which case either 
one may not be on an 8-byte boundary. A function that 
looks up and calculates masks is shown in Example 5. 

Because the value 0xFFFF is used for masking, the Z-buffer
is initialized with 0xFFFE, so that the fzchks
instruction always finds the mask to be greater than
any Z-buffer contents.
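The masking idea can be sketched in Python. The mask values follow the tables in Example 5; the comparison step is a simplified model that assumes fzchks treats each 16-bit field as unsigned and marks a pixel for update when the new Z is less than or equal to the old Z. All function names are hypothetical.

```python
ALL_ONES = (1 << 64) - 1


def left_mask(align):
    """0xFFFF over the `align` pixels that precede the first pixel (0..3)."""
    return (1 << (16 * align)) - 1


def right_mask(align):
    """0xFFFF over the pixels that follow the last pixel at position `align`."""
    return ALL_ONES & ~((1 << (16 * (align + 1))) - 1)


def fields16(word):
    """Split a 64-bit word into four 16-bit fields, lowest field first."""
    return [(word >> (16 * i)) & 0xFFFF for i in range(4)]


def z_check(old_word, new_word, mask):
    """OR the mask into the new Z values, then keep the smaller Z per pixel."""
    masked = new_word | mask
    merged, update = 0, []
    for i, (old, new) in enumerate(zip(fields16(old_word), fields16(masked))):
        closer = new <= old
        update.append(closer)
        merged |= (new if closer else old) << (16 * i)
    return merged, update
```

A masked field holds 0xFFFF, and the Z-buffer never holds more than 0xFFFE, so a masked pixel can never be marked as closer.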
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Figure 4. faddp Operands 



// CONSTRUCT INTERPOLANTS iR, iG, iB GIVEN mR, mG, mB
        shl      18, mR, Ra         // Multiply each color slope by four, then
        shl      18, mG, Rb         //   shift by 16 to put the significant
        shl      18, mB, Rc         //   bits into the high-order half
        shr      16, Ra, mR         // Return significant 16 bits
        shr      16, Rb, mG         //   to low-order half. Any sign bits
        shr      16, Rc, mB         //   in high-order half are gone.
        or       mR, Ra, Ra         // Join 16-bit quarters
        or       mG, Rb, Rb         //   in 32-bit register
        or       mB, Rc, Rc         //
        ixfr     Ra, iR             // Join 32-bit halves
        ixfr     Rb, iG             //   in 64-bit register
        ixfr     Rc, iB             //
        fmov.ss  iR, iRh            //
        fmov.ss  iG, iGh            //
        fmov.ss  iB, iBh            //





Example 4. Construction of Color Interpolants 
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.macro zmask l_align, realign, Rx, 


        Ry
// l_align, r_align - left- and right-end alignment [0..3] in 2-byte units
// Rx, Ry - scratch registers
        .data
        .align   8
left_mask::                                 // low         high
        .long    0x00000000, 0x00000000     // 0 mod 4
        .long    0x0000FFFF, 0x00000000     // 1 mod 4
        .long    0xFFFFFFFF, 0x00000000     // 2 mod 4
        .long    0xFFFFFFFF, 0x0000FFFF     // 3 mod 4
right_mask::                                // low         high
        .long    0xFFFF0000, 0xFFFFFFFF     // 0 mod 4
        .long    0x00000000, 0xFFFFFFFF     // 1 mod 4
        .long    0x00000000, 0xFFFF0000     // 2 mod 4
        .long    0x00000000, 0x00000000     // 3 mod 4
        .text
        shl      3, l_align, l_align        // Multiply by 8
        mov      left_mask, Rx              //
        fld.d    l_align(Rx), lZmask        // Load 8-byte mask
        shl      3, r_align, r_align        // Multiply by 8
        mov      right_mask, Rx             //
        fld.d    r_align(Rx), rZmask        // Load 8-byte mask
// If the first and last pixels are contained in the same 64-bit
// aligned set, then lZmask = lZmask OR rZmask.
        andh     0x8000, dX, r0             // Is dX negative
        bc       L2                         // If not, right end is in other set
        fxfr     lZmask, Rx                 //
        fxfr     rZmask, Ry                 //
        or       Rx, Ry, Rx                 // OR low-order half
        ixfr     Rx, lZmask                 //
        fxfr     lZmaskh, Rx                //
        fxfr     rZmaskh, Ry                //
        or       Rx, Ry, Rx                 // OR high-order half
        ixfr     Rx, lZmaskh                //
L2:     nop                                 //
.endm








Example 5. Z Mask Procedure 
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Table 2. Accumulator Initial Values 






Alignment   Initial Z Accumulator Values

    0       Z1 - 1*mZ    Z1 - 3*mZ
    2       Z1 - 2*mZ    Z1 - 4*mZ
    4       Z1 - 3*mZ    Z1 - 5*mZ
    6       Z1 - 4*mZ    Z1 - 6*mZ

Alignment   Initial Color Accumulator Values (C = R, G, B)

    0       C1 - 1*mC    C1 - 2*mC    C1 - 3*mC    C1 - 4*mC
    2       C1 - 2*mC    C1 - 3*mC    C1 - 4*mC    C1 - 5*mC
    4       C1 - 3*mC    C1 - 4*mC    C1 - 5*mC    C1 - 6*mC
    6       C1 - 4*mC    C1 - 5*mC    C1 - 6*mC    C1 - 7*mC



Table 3. Accumulator Initialization Table 



Alignment   Table Values
            *mZ       *mR               *mG               *mB

    0       -1, -3    -1, -2, -3, -4    -1, -2, -3, -4    -1, -2, -3, -4
    2       -2, -4    -2, -3, -4, -5    -2, -3, -4, -5    -2, -3, -4, -5
    4       -3, -5    -3, -4, -5, -6    -3, -4, -5, -6    -3, -4, -5, -6
    6       -4, -6    -4, -5, -6, -7    -4, -5, -6, -7    -4, -5, -6, -7



4.2 Accumulator Initialization 

When the first pixel of the line segment is not at an 8-byte
boundary, initial values placed in the accumulators
(aZ, aB, aG, and aR) must be selected so that Z1,
Red1, Grn1, and Blu1 correspond to the correct pixel.
The desired result is that shown by Table 2. However,
each value is a composite of two terms: one that is
constant for each edge pair (n*mZ, n*mR, n*mG,
n*mB) and one that can vary with each scan line (Z1,
Red1, Grn1, Blu1). The example assumes that the constant
values have all been calculated and stored in a
memory table of the format shown by Table 3. At the
beginning of each line segment the values appropriate
to the alignment of the line segment are retrieved from
the table and added to the initial Z and color values, as
shown in Example 6.
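The multipliers in Table 3 follow a simple pattern, sketched here in Python. Alignment is expressed in pixels (byte alignment divided by two), and the function names are hypothetical. The sketch works with plain integers; it does not model the packing of the values into 64-bit registers.

```python
def z_coefficients(align):
    """Multipliers of mZ for the (high, low) halves of the Z accumulator."""
    return (-(align + 1), -(align + 3))


def color_coefficients(align):
    """Multipliers of mC for the four 16-bit color accumulator fields."""
    return tuple(-(align + k) for k in range(1, 5))


def initial_z_accumulator(z1, m_z, align):
    """Constant term (from the table) plus the per-scan-line term Z1."""
    high, low = z_coefficients(align)
    return (z1 + high * m_z, z1 + low * m_z)


for align in range(4):
    print(2 * align, z_coefficients(align), color_coefficients(align))
```

The first interpolation step adds 3*mZ (or 4*mC for colors), so these negative offsets place Z1 and C1 exactly on the pixel at the given alignment.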



5.0 THE INNER LOOP 

Once the proper preparations have been made, only a
minimal amount of code is needed to render each scan-line
segment of a polygon. The code shown in Example
7 operates on four pixels in each loop. The left and
right ends of the line segment go through different logic
paths so that the Z-buffer masks can be applied by the
form instruction. All the interior points are handled by
the tight inner loop.

The controlling variable dX is zero-relative and is expressed
as a number of pixels. The value of dX also
indicates alignment of the end-points with respect to
the 4-pixel groups. Unaligned left-end pixels are subtracted
from dX before entering the inner loop; therefore,
subsequent values of dX indicate the alignment of
the right end. A value that is 3 mod 4 indicates that the
right end is aligned, which explains the test for a value
of -5 near the end of the loop (-5 mod 4 = 3). The
fact that the value -5 is loaded into register Rb on
every execution of the loop does not represent a programming
inefficiency, because there is nothing else for
the core unit to do at that point anyway.
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// ACCUMULATOR 


// INITIALIZATION TABLE
        .data; .align .double
acc_init_tab::  .double [16]
        .dsect
aBi:    .double                     // Four initial 16-bit blue values
aGi:    .double                     // Four initial 16-bit green values
aRi:    .double                     // Four initial 16-bit red values
aZi:    .double                     // Two initial 32-bit Z values
        .end
        .text
// INITIALIZE ACCUMULATORS
.macro acc_init Lalign, Rtab, Rx, Ry, Fx, Fxh
// Lalign - left-end alignment (0..3) in two-byte units
// Rtab - register to use for addressing the table
// Rx, Ry, Fx, Fxh - scratch registers
        mov      acc_init_tab, Rtab     //
        shl      5, Lalign, Lalign      // Multiply by row width
        adds     Lalign, Rtab, Rtab     // Index row corresponding to alignment
        fld.d    aZi(Rtab), aZ          // Z
        ixfr     Z1, Fx                 // Z
        fld.d    aRi(Rtab), aR          // R-Load constant values
        shl      16, Red1, Rx           // R-Shift starting value to hi-order
        fmov.ss  Fx, Fxh                // Z
        shr      16, Rx, Ry             // R-Red1 stripped of sign bits
        fiadd.dd Fx, aZ, aZ             // Z
        or       Rx, Ry, Ry             // R-Form (Red1, Red1)
        ixfr     Ry, Fx                 // R-Put in 64-bit register
        fld.d    aGi(Rtab), aG          // G
        shl      16, Grn1, Rx           // G
        fmov.ss  Fx, Fxh                // R-Form (Red1, Red1, Red1, Red1)
        shr      16, Rx, Ry             // G
        fiadd.dd Fx, aR, aR             // R-Add variables to constants
        or       Rx, Ry, Ry             // G
        ixfr     Ry, Fx                 // G
        fld.d    aBi(Rtab), aB          // B
        shl      16, Blu1, Rx           // B
        fmov.ss  Fx, Fxh                // G
        shr      16, Rx, Ry             // B
        fiadd.dd Fx, aG, aG             // G
        or       Rx, Ry, Ry             // B
        ixfr     Ry, Fx                 // B
        fmov.ss  Fx, Fxh                // B
        fiadd.dd Fx, aB, aB             // B
.endm
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II 


// RENDERING PROCEDURE
// 16-bit pixels, 16-bit Z-buffer
        and      3, X1, Ra              // Determine alignment of starting-point
        acc_init Ra, Rb, Rc, Rd, Fa, Fah // Initialize accumulators
        subs     4, Ra, Rb              // 4 - alignment
        subs     dX, Rb, dX             // Adjust dX by X1 alignment
// If dX <= 0, then right end is in same set as left end
        and      3, dX, Rb              // Determine alignment of right end
        zmask    Ra, Rb, Rc, Rd         // Prepare both left- and right-end masks
left_end::                              // Handle boundary conditions
        d.faddz  aZ, iZ3, aZ            // Interpolate 2 even Z values
        adds     -8, FBP, FBP           // Anticipate autoincrement
        d.faddz  aZ, iZ1, aZ            // Interpolate 2 odd Z values
        adds     -8, ZBP, ZBP           // Anticipate autoincrement
        d.form   lZmask, newz           // Mask 4 new Z values
        fld.d    8(ZBP), oldz           // Fetch 4 old Z values
        d.faddp  aB, iB, aB             // Interpolate 4 blue intensities
        mov      -4, Ra                 // Loop increment: 4 pixels
        d.faddp  aG, iG, aG             // Interpolate 4 green intensities
        adds     -4, dX, dX             // Prepare dX for bla at end of loop
        d.faddp  aR, iR, aR             // Interpolate 4 red intensities
        bla      Ra, dX, L1             // Initialize LCC
        d.form   f0, newi               // Move 4 new pixels to 64-bit reg
        adds     5, dX, r0              // Are there any whole sets (dX < -5)?
L1:
        d.fzchks oldz, newz, newz       // Mark closer points in PM[7..4]
        bc       short_segment          // Get out now if no whole set
        d.fnop                          //
        fld.d    16(ZBP), oldz          // Fetch 4 old Z values
inner_loop::                            // Handle all interior points
        d.faddz  aZ, iZ3, aZ            // Interpolate 2 even Z values
        nop                             //
        d.faddz  aZ, iZ1, aZ            // Interpolate 2 odd Z values
        fst.d    newz, 8(ZBP)++         // Update Z buf from prior loop
        d.form   f0, newz               // Move 4 new Z values to 64-bit reg
        nop                             //
        d.fzchks f0, f0, f0             // Shift PM[7..4] to PM[3..0]
        mov      -5, Rb                 // -5 mod 4 = 3, aligned right end
        d.faddp  aB, iB, aB             // Interpolate 4 blue intensities
        pst.d    newi, 8(FBP)++         // Store pixels indicated by PM[3..0]
        d.faddp  aG, iG, aG             // Interpolate 4 green intensities
        xor      Rb, dX, r0             // Are we at an aligned right end?
        d.faddp  aR, iR, aR             // Interpolate 4 red intensities
        bc       aligned_end            // Taken if at an aligned right end
        d.form   f0, newi               // Move 4 new pixels to 64-bit reg
        bla      Ra, dX, inner_loop     // Loop if not at end of line segment
        d.fzchks oldz, newz, newz       // Mark closer points in PM[7..4]
        fld.d    16(ZBP), oldz          // Fetch 4 old Z values for next loop
// End of inner_loop. Right end not aligned
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right_end:: // 


// Handle boundary conditions
        d.faddz  aZ, iZ3, aZ            // Interpolate 2 even Z values
        nop                             //
        d.faddz  aZ, iZ1, aZ            // Interpolate 2 odd Z values
        fst.d    newz, 8(ZBP)++         // Update Z buf from prior loop
        d.form   rZmask, newz           // Mask 4 new Z values
        nop                             //
        d.fzchks f0, f0, f0             // Shift PM[7..4] to PM[3..0]
        nop                             //
        d.faddp  aB, iB, aB             // Interpolate 4 blue intensities
        pst.d    newi, 8(FBP)++         // Store pixels indicated by PM[3..0]
        d.faddp  aG, iG, aG             // Interpolate 4 green intensities
        nop                             //
        d.faddp  aR, iR, aR             // Interpolate 4 red intensities
        nop                             //
aligned_end::                           // No special boundary conditions
        d.form   f0, newi               // Move 4 new pixels to 64-bit reg
        br       wrap_up                //
        d.fzchks oldz, newz, newz       // Mark closer points in PM[7..4]
        nop                             //
short_segment:
        d.fnop                          //
        adds     8, dX, r0              // Is right end in same set as left?
        d.fnop                          //
        bnc.t    right_end              // Branch taken if no.
        d.fnop                          //
        fld.d    16(ZBP), oldz          // Fetch 4 old Z values
wrap_up::                               // Store the unstored and leave dual mode.
        fzchks   f0, f0, f0             // Shift PM[7..4] to PM[3..0]
        fst.d    newz, 8(ZBP)++         // Update Z buf from prior loop
        fnop
        pst.d    newi, 8(FBP)++         // Store pixels indicated by PM[3..0]
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6.0 ALTERNATIVE IMPLEMENTATIONS 

Example 8 contrasts the inner loop of the 16-bit pixel rendering procedure with that of an 8-bit procedure. For 8-bit 
pixels, two faddp instructions accomplish 64-bits of pixel intensity interpolation; there is no need to maintain three 
separate color accumulators. Four faddz instructions (rather than two) are required, because eight Z values are 
created for the eight pixels per loop. 



// 8-Bit Pixels, 16-Bit Zbuffer = 8 Pixels in 15 Clocks
// G-Unit                           | Core Unit
inner_loop::
d.faddz  aZ,deltaZ1,aZ                fld.q  16(ZBP),oldZ_A
d.faddz  aZ,deltaZ2,aZ                nop
d.form   f0,newZ_A                    nop
d.faddz  aZ,deltaZ1,aZ                andh   0x8000,dX,r0
d.faddz  aZ,deltaZ2,aZ                bnc    rightend
d.form   f0,newZ_B                    nop
d.fzchks oldZ_A,newZ_A,newZ_A         nop
d.fzchks oldZ_B,newZ_B,newZ_B         nop
d.faddp  intens,dI1,intens            fst.q  newZ_A,16(ZBP)++
d.faddp  intens,dI2,intens            bte    0,dX,end
d.form   f0,newi                      bla    neg8,dX,inner_loop
d.fnop                                pst.d  newi,8(FBP)++
//

// 16-Bit Pixels, 16-Bit Zbuffer = 4 Pixels in 10 Clocks
// G-Unit                           | Core Unit
inner_loop::
d.faddz  aZ,iZ3,aZ                    nop
d.faddz  aZ,iZ1,aZ                    fst.d  newz,8(ZBP)++
d.form   f0,newz                      nop
d.fzchks f0,f0,f0                     mov    -5,Rb
d.faddp  aB,iB,aB                     pst.d  newi,8(FBP)++
d.faddp  aG,iG,aG                     xor    Rb,dX,r0
d.faddp  aR,iR,aR                     bc     aligned_end
d.form   f0,newi                      bla    neg4,dX,inner_loop
d.fzchks oldz,newz,newz               fld.d  16(ZBP),oldz
//
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ABSTRACT 

The i860 Processor computes floating-point results rap- 
idly, lending itself to DSP (digital signal processing) as 
well as general-purpose computing. With this high per- 
formance, DSP functions can be added to any system 
containing an i860 CPU. A Fast Fourier Transform 
(FFT) illustrates this DSP power. Complete code for 
the FFT is presented in this application note, as well as 
performance measurements. Both complex and real in- 
put data FFTs are included, as well as both Decimation 
in Time and Decimation in Frequency. 



1.0 INTRODUCTION TO FAST 
FOURIER TRANSFORMS 

Discrete Fourier Transforms (DFTs) change time-do- 
main data samples into a frequency-domain profile of 
the sampled signal. The frequency-domain representa- 
tion consists of the magnitudes of sine waves at various 
frequencies, which would recreate the original data if 
superimposed. To accomplish the transform, a DFT 
adds combinations of the input data samples, after mul- 
tiplying some of those inputs with weighting factors. 
The number of samples, "N", is usually a power of two. 

Each result in the frequency domain comes from a 
weighted sum of all data samples. The weighting ("W") 
factors are called "twiddles", and are complex cosine/ 
sine values for each particular frequency. 

The FFT (Fast Fourier Transform) is an efficient im- 
plementation of the DFT, defined by: 

x(n) = time domain samples of the signal,
n = 0, 1, ... N-1

X(k) = the Discrete Fourier Transform of x(n),
k = 0, 1, ... N-1

= a "frequency domain" equivalent of x(n)
= Σ x(n) * W^(nk), n = 0 to N-1, and
W^(nk) = e^(-j2πnk/N), where j = sqrt(-1)

= Σ x(n) * (cos(2πnk/N) - j * sin(2πnk/N))
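The definition can be evaluated directly, as in the following Python sketch. It is the plain Order (N^2) sum, useful only as a reference against which an FFT can be checked; the function name dft is chosen for illustration.

```python
import cmath


def dft(samples):
    """Direct evaluation of X(k) = sum over n of x(n) * e^(-j*2*pi*n*k/N)."""
    n_points = len(samples)
    result = []
    for k in range(n_points):
        total = 0j
        for n, x in enumerate(samples):
            twiddle = cmath.exp(-2j * cmath.pi * n * k / n_points)  # W^(nk)
            total += x * twiddle
        result.append(total)
    return result


# A constant signal has all of its energy at frequency zero.
print([round(abs(v), 6) for v in dft([1, 1, 1, 1])])
```

Each output point needs a sum over all N inputs, which is the cost that the FFT reduces.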

The (N-1) complex adds and (N-1) complex multiplications
required for each X(k) make the DFT an Order
(N^2) computation. Fortunately, the FFT decomposes
this to an Order (N * log2 N) algorithm by splitting the
N-sum into units of 2-sums. These units are called
"butterflies" because they produce 2 output values
from 2 inputs, with the butterfly-shaped dataflow
shown below. (Some FFT algorithms, called Radix-4,
use 4-input, 4-output butterflies.) The butterfly calculations
are executed in stages, with log2 N stages and N/2
butterflies per stage.



The subdivision, or decimation, of the N-sum into but- 
terflies can be done via two different methods: "Deci- 
mation in Time" (DIT) or "Decimation in Frequency" 
(DIF). The methods differ in the ordering of twiddles 
and the form of the butterfly arithmetic, but they yield 
the same answer. They are based on different mathe- 
matical derivations of the FFT: DIT results from recur- 
sively splitting the input time-domain samples into an 
even-indexed group and an odd-indexed, while DIF 
comes from splitting the DFT output frequency-do- 
main points into odd/even groups. 



2.0 BUTTERFLY DEFINED 

Let A = the first input to the butterfly (complex 
number, composed of Real part AR and 
Imaginary part AI) 

B = the second input to the butterfly (com- 
plex, BR and BI) 

W = twiddle factor (also complex, WR and 
WI) 

Anew = complex result #1, which overwrites A 

Bnew = result #2, which overwrites B 

For a "Decimation-in-Frequency" butterfly, 
Anew = A + B 
Bnew= (A - B) * W 

The complex add, subtract, and multiply of a butterfly 
decompose into 4 real multiplies, 3 real adds, and 3 real 
subtracts: 

AnewR = AR + BR        tempR = AR - BR

AnewI = AI + BI        tempI = AI - BI

BnewR = (tempR * WR) - (tempI * WI)
BnewI = (tempR * WI) + (tempI * WR)

For a "Decimation-in-Time" butterfly, 
Anew = A + (B * W) 
Bnew = A - (B * W) 

The number of real operations remains 4 multiplies and 
6 add/subtracts, but the equations differ and the multi- 
plies must be done first: 

tempR = (WR * BR) - (WI * BI)

tempI = (WR * BI) + (WI * BR)

AnewR = AR + tempR        BnewR = AR - tempR

AnewI = AI + tempI        BnewI = AI - tempI
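For comparison, here is the Decimation-in-Time butterfly as a small C sketch, with the complex multiply done first as the text requires. As before, `cplx` and the function name are illustrative assumptions.

```c
typedef struct { float re, im; } cplx;   /* hypothetical complex type */

/* Decimation-in-Time butterfly, computed in place:
 *   Anew = A + (B * W),   Bnew = A - (B * W)
 * 4 real multiplies and 6 real add/subtracts. */
static void dit_butterfly(cplx *a, cplx *b, cplx w)
{
    float temp_r = w.re * b->re - w.im * b->im;   /* tempR */
    float temp_i = w.re * b->im + w.im * b->re;   /* tempI */

    b->re = a->re - temp_r;                       /* BnewR = AR - tempR */
    b->im = a->im - temp_i;                       /* BnewI = AI - tempI */
    a->re = a->re + temp_r;                       /* AnewR = AR + tempR */
    a->im = a->im + temp_i;                       /* AnewI = AI + tempI */
}
```

Bnew is written before Anew here so that the original A is still available for both results.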
Butterfly Dataflow:

(Decimation in Frequency)   Anew = A + B          Bnew = (A - B) * W

(Decimation in Time)        Anew = A + (B * W)    Bnew = A - (B * W)
The stages, twiddles, and butterflies for 8-point FFTs
are shown in Figures 1 and 2. For larger values of N,
the dataflow patterns are very similar, with N/2 butterflies
executed at each stage, and a greater number of
stages. Refer to a text on Digital Signal Processing for a
complete discussion of FFT design, such as chapter 6 of
Theory and Application of Digital Signal Processing (see
the Bibliography at the end of this note).
Figure 1. Decimation-In-Frequency FFT for 8 points

Figure 2. Decimation-In-Time FFT for 8 points

3.0 BIT REVERSAL

Due to their structure, FFT algorithms have the side- 
effect of scrambling the ordering of output data. For 
radix-2 FFTs, the output is in "bit-reversed" order — 
for example, the value for frequency one is NOT at 
location one in the output array, but at location N/2. 
Time to unscramble the output is often NOT included 
in FFT benchmarking, because scrambled output is fine 
for some signal-processing uses such as convolution. In 
any event, unscrambling consists of swapping the loca- 
tions of pairs of output values. Alternatively, input val- 
ues can be shuffled, as Decimation in Time usually does 
before the first stage (as shown in Figure 2). Otherwise, 
to avoid the shuffling of input in DIT, the twiddles 
must be accessed in bit-reversed order. As an example 
of bit-reversal, for 256 points the reordering involves: 

SWAP X(i) and X(j), where i = 'klmnopqr'b and j = 
'rqponmlk'b. The second index (j) contains the same 
bits as (i), but in opposite order. 
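The index mapping can be sketched in C as follows. This is a straightforward bit-by-bit reversal written for illustration (the function name is hypothetical); it is not the table-driven method used by the assembly code in this note.

```c
/* Reverse the low `bits` bits of index i.
 * For 256 points, bits = 8: 'klmnopqr'b becomes 'rqponmlk'b. */
static unsigned bit_reverse(unsigned i, unsigned bits)
{
    unsigned j = 0;
    for (unsigned b = 0; b < bits; b++) {
        j = (j << 1) | (i & 1u);   /* shift the lowest bit of i into j */
        i >>= 1;
    }
    return j;
}
```

With bits = 8, index 1 maps to 128, which is N/2 for 256 points, matching the remark above that frequency one lands at location N/2.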



4.0 FFT IMPLEMENTATION ON THE 
i860 CPU 

Several features of the i860 CPU contribute to FFT 
performance. The floating-point multiplier and adder 
can simultaneously produce 1 product and 1 sum per 
cycle, using Dual-Operation FP instructions. To fetch
the butterfly inputs and store outputs, Dual-Instruction-Mode
allows a memory fetch or store simultaneously
with the multiply and add. Four floating-point numbers
can be stored by one instruction, using the 16-byte-operand
"fst.q" instruction. Likewise, 16 bytes can be
fetched from the data cache in one fld.q op.

The floating-point arithmetic of the i860 CPU con- 
forms to IEEE 754 format, which some DSPs fail to do. 
Shown below is code for the crucial inner loop of the 
FFT: 




//------------------------------------------------------------------
//inner_loop: do 2 Decimation-In-Frequency FFT butterflies.
// Twelve clocks for 2 butterflies - 12 FP add/sub, 8 multiplies,
// 6 8-byte loads, 4 8-byte stores.
//  FP-op                     Core-op
inner_loop::
  d.r2pt.ss   WR,DI,BnewR
                              pfld.d  wind(wstart),WRo
  d.pfsub.ss  AR,BR,AnewRo
                              fld.d   8(fetch)++,ARo
  d.rat1s2.ss AI,BI,AnewIo
                              fld.d   offset(fetch),BRo
  d.i2st.ss   WI,DR,BnewI
                              fst.q   AnewR,16(store)++
  d.rat1p2.ss AR,BR,DR
                              adds    wincr,wind,wind
  d.ia1p2.ss  AI,BI,DI
                              pfld.d  wind(wstart),WR
//
  d.r2pt.ss   WRo,DI,BnewRo
                              adds    wincr,wind,wind
  d.pfsub.ss  ARo,BRo,AnewR
                              fld.d   8(fetch)++,AR
  d.rat1s2.ss AIo,BIo,AnewI
                              fld.d   offset(fetch),BR
  d.i2st.ss   WIo,DR,BnewIo
                              fst.q   BnewR,offset(store)
  d.rat1p2.ss ARo,BRo,DR
                              bla     decrem,count,inner_loop
  d.ia1p2.ss  AIo,BIo,DI
                              and     wlimit,wind,wind   //modulo.
//------------------------------------------------------------------
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5.0 CODE DESIGN 

Refer to the inner loop above and the code listings at the
end of this application note for the discussions that follow.
Refer to the "i860™ 64-bit Microprocessor Programmer's
Reference Manual" (Intel order number
240329) for details on instructions and formats.

The programs include both assembly and Fortran components.
Input data can number any power of 2 from
16 to 1024 points. The algorithms are radix-2, floating-point,
in-place. Included in the listing are both
Decimation-in-Time and Decimation-in-Frequency, and both
complex-input and real-input FFTs.



5.2 Pfld 

Twiddle factors (W) are fetched with pfld (Pipelined 
Floating-Point Load), to avoid caching them. Only in 
the first stage are all the W() elements used; successive 
stages use fewer and fewer elements, which are separat- 
ed by larger and larger strides. Thus placing W() in 
cache would be inefficient. The streaming of W() from 
main memory actually yields better performance than 
caching W(), for 512 and 1024 points. With the i860 
CPU's 8-byte external data bus, a complex W() value 
can be transferred in a single bus cycle. Some FFT rou- 
tines calculate W() on the fly, rather than fetching pre- 
calculated values; however, performance decreases due 
to the added run-time calculations. 
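The shrinking use of the twiddle table can be made concrete with a short C sketch. It assumes the stage numbering used by the Fortran driver diff.f in the Appendix (stage 1 is the first DIF stage) and a single shared table of N/2 twiddles; the function names are hypothetical.

```c
/* Stride, in table elements, between successive twiddles used in a stage.
 * Stage 1 uses every element; each later stage doubles the stride. */
static unsigned twiddle_stride(unsigned stage)
{
    return 1u << (stage - 1);
}

/* Number of distinct twiddles used in a stage of an N-point DIF FFT.
 * Stage 1 uses all N/2; each later stage uses half as many. */
static unsigned twiddles_used(unsigned n, unsigned stage)
{
    return (n / 2) >> (stage - 1);
}
```

For 1024 points, stage 1 touches all 512 table entries at stride 1, while stage 9 touches only 2 entries, 256 elements apart, which is the access pattern that makes caching W() unattractive.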



5.1 Cache Utilization 

Because the instruction cache contains 4 Kbytes, all required
code easily fits in cache. However, a 1024-point
complex FFT fills the 8-Kbyte data cache with the input
X() array. Thus the more rarely-used twiddle W()
array is intentionally kept out of cache, as described in
the "pfld" section.

A subroutine ("fetch.ss") is used to move the input data 
array efficiently into cache for the 1024-point FFT. 
"Fetch" allows all data to be brought into cache using 
the next-near (NENE#) accesses to DRAM. Without 
that routine, getting A and B from locations separated 
by 4 Kbytes (NOT the same DRAM page) makes 
fetches and writebacks from DRAM for the first stage 
slower, and adds 30% to overall execution time. 

For larger FFTs (2048 points = 16 kB), straightfor- 
ward expansion of the present algorithm would cause 
increased cache misses. Thus a larger FFT should be 
broken into multiple FFTs of 1024 points so that all 10 
stages of each can achieve high cache hits. The algo- 
rithm becomes (assuming 2048 points, Decimation-In- 
Time): 

1) Bit-reverse the entire input array 

2) Do a 10-stage FFT on the second set of 1024 points. 
Cache hits should be high on those, since they were 
most recently accessed by the bit-reversal. 

3) Do a 10-stage FFT on the first 1024 points. Prefetch 
before the first stage to ensure cache hits. 

4) Combine the 2 separate 1024-point results with a fi- 
nal stage of butterflies, where A is offset from B by 
8 Kbytes. 
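The sizes behind the four steps above can be checked with a little arithmetic. The following C sketch uses the figures given in this note (8 bytes per single-precision complex point, an 8-Kbyte data cache, 1024-point sub-FFTs); the constant and function names are assumptions made for illustration.

```c
enum {
    BYTES_PER_POINT = 8,      /* 4-byte real + 4-byte imaginary */
    DCACHE_BYTES    = 8192,   /* on-chip data cache size */
    SUB_FFT_POINTS  = 1024    /* largest FFT whose data fits in cache */
};

/* Size of the complex data array in bytes. */
static unsigned long array_bytes(unsigned n)
{
    return (unsigned long)n * BYTES_PER_POINT;
}

/* Nonzero if the whole data array fits in the data cache. */
static int fits_in_dcache(unsigned n)
{
    return array_bytes(n) <= DCACHE_BYTES;
}

/* Number of cache-sized sub-FFTs an N-point transform is split into. */
static unsigned sub_fft_count(unsigned n)
{
    return n <= SUB_FFT_POINTS ? 1u : n / SUB_FFT_POINTS;
}

/* Byte distance between A and B in the final DIT stage (N/2 points apart). */
static unsigned long final_stage_offset_bytes(unsigned n)
{
    return array_bytes(n / 2);
}
```

For 2048 points this gives a 16-Kbyte array, two 1024-point sub-FFTs, and a final-stage offset of 8192 bytes, in agreement with step 4.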



5.3 Fst.q 

Quad-word (16-byte) stores allow 4 floating-point register
values to update the cache in one cycle. Likewise,
fld.q (Quad Floating Point Load) transfers 4 values to 
the registers in a cycle. However, in some FFT stages, 
double-word fetches (fld.d) are used instead of fld.q; 
that allows the "background" fetch of a set of operands 
concurrent with arithmetic on the other set. For the 
same reason, the inner loop does 2 butterflies, rather 
than one. 



5.4 Bit Reversal Code 

The code for bit-reversal fetches the indices of 2 ele- 
ments to be swapped from a pre-allocated array of indi- 
ces, and swaps the data elements. Again, pfld.d keeps 
the indices out of cache, for the 1024 point case. That 
assembly version of bit-reversal is approximately 7 
times faster than the standard Fortran routine. The array
of indices was generated by printing out the values
generated during operation of the standard Fortran version;
similarly, the twiddle W() values can be pre-allocated
and generated using a high-level-language
program.
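A table of swap indices like the one described above could be produced by a short high-level program. The C sketch below is one possible way to do it, not the method used to build the table in this note (which was printed from the Fortran routine); all names are hypothetical.

```c
typedef struct { float re, im; } cplx;   /* hypothetical complex type */

/* Reverse the low `bits` bits of index i. */
static unsigned bit_reverse(unsigned i, unsigned bits)
{
    unsigned j = 0;
    for (unsigned b = 0; b < bits; b++) {
        j = (j << 1) | (i & 1u);
        i >>= 1;
    }
    return j;
}

/* Fill pairs[][2] with every (i, j) that must be swapped, keeping i < j so
 * each pair appears once. Returns the number of pairs. The caller must
 * provide room for up to N/2 pairs, where N = 2**bits. */
static unsigned build_swap_table(unsigned bits, unsigned pairs[][2])
{
    unsigned n = 1u << bits;
    unsigned count = 0;
    for (unsigned i = 0; i < n; i++) {
        unsigned j = bit_reverse(i, bits);
        if (i < j) {
            pairs[count][0] = i;
            pairs[count][1] = j;
            count++;
        }
    }
    return count;
}

/* Swap the data elements named by each index pair. */
static void apply_swap_table(cplx *x, unsigned pairs[][2], unsigned count)
{
    for (unsigned p = 0; p < count; p++) {
        cplx t = x[pairs[p][0]];
        x[pairs[p][0]] = x[pairs[p][1]];
        x[pairs[p][1]] = t;
    }
}
```

For 8 points the table holds just two pairs, (1,4) and (3,6); indices whose bit pattern is a palindrome need no swap.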



6.0 PIPELINE SCHEDULING 

The adder pipeline is 3 stages, as is the multiplier; for 
the calculation of 

BnewR = (AR - BR) * WR - (AI - BI) * WI

the adder result is fed back into the multiplier, and the 
product again feeds into the adder. The adder and mul- 
tiplier pipes each advance one stage for each floating- 
point instruction issued. 
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The butterfly decomposes into 6 real add/subtracts and 
4 real multiplies. Thus the best possible performance 
would be 6 clocks per butterfly, with the multiplies to- 
tally overlapping the adds. The overlap is accomplished 
with the Dual-Operation instructions: 

r2pt   (KR*src2, Treg + Mout, load KR <- src1)

rat1s2 (KR*Aout, src1 - src2, load T <- Mout)

i2st   (KI*src2, Treg - Mout, load KI <- src1)

rat1p2 (KR*Aout, src1 + src2, load T <- Mout)

ia1p2  (KI*Aout, src1 + src2, load KI <- src1)

KR, KI, and T are operand registers feeding the multiplier
and adder, separate from the floating-point register
file. They permit the 4 inputs for multiply and add,
even though the instruction format holds only 2 registers.
"Aout" and "Mout" are the adder and multiplier
outputs.

The data path arrangements of some of these ops are 
illustrated in Figures 3 and 4. Fetching and storing of 
butterfly operands is overlapped with the calculations, 
using Dual Instruction Mode — the integer core op 
(such as a load or branch) and FP op are fetched simul- 
taneously from the instruction cache and executed 
simultaneously. 

Scheduling of instructions was done with a pipeline diagram,
as illustrated in the comments of the code listing
of difstep.ss in the Appendix. (The comments show the
machine state after the instruction is processed.) Begin
by placing the desired results in the rightmost column,
then trace progress backwards through the adder.
When adder inputs are products (of the multiplier), one
product is kept in the Treg for a cycle while the other
propagates through the multiplier final stage. Those
products can be traced back on the multiplier pipeline,
to determine at what instruction the multiplier inputs
must be provided.

For example, place the BnewR label in the "Write" 
stage of the pipe (the output of the Adder). Now 

BnewR = WR * DR - WI * DI

Three instructions earlier, the adder inputs for BnewR
must be fed to the adder; those inputs are products, one of
which comes directly from the multiplier output, and
the other from the Treg. The multiplier output and
Treg value must then be traced back through the multiplier
stages, requiring the following instructions:

i2st.ss   WIo,DR,BnewIo   as the 10th op of 12, to start (T - Mout)

rat1s2.ss AIo,BIo,AnewI   as the 9th instruction, to update the Treg

ia1p2.ss  AI,BI,DI        as the 6th op, to multiply DI * WI

rat1p2.ss AR,BR,DR        as the 5th op, to multiply DR * WR

rat1s2.ss AI,BI,AnewIo    as the 3rd, to start DI into the adder

pfsub.ss  AR,BR,AnewRo    as the 2nd, to start DR into the adder




Figure 3. Datapath for r2pt op 
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Figure 4. Datapath for rat1p2 op 
Some trial-and-error ordering of the desired outputs is
needed to devise a sequence which keeps the adder
pipeline full. An op is chosen for each slot for its ability
to load the KR or KI register, or to initiate an adder
operation simultaneous with the multiplies required to
calculate BnewR and BnewI.

Handy hints to assist dual-operation scheduling include:

1) Feed back the adder result to the multiplier, or vice
versa, whenever possible. For example, the rat1p2
op feeds adder-out to the multiplier. Thus both the src1 and
src2 fields of the instruction are available to feed the
adder-in, and a useful add and multiply are initiated
simultaneously.

2) Freeze one of the pipes, by using a pfadd or pfmul,
when appropriate. In the butterfly, where 6 adds are
done for every 4 multiplies, freezing of the multiplier
does not degrade performance. The freeze allows
multiplier results to be held until needed in the adder.

3) The Treg can hold a multiplier result for several
cycles until needed in the adder.

4) Unroll a loop to do 2 iterations per loop. That provides
time to fetch inputs for iteration 2 while calculating
iteration 1, and store results of iteration 1
(and fetch more inputs) while calculating iteration 2.



7.0 PERFORMANCE MEASUREMENTS 

The code was run on an evaluation card with DRAM 
memory only, no external cache, 33.33 MHz clock, and 
5 wait-states or more for some accesses. Next-near ac- 
cesses (address falls into the same DRAM page as the 
previous access) are zero wait-state, but far accesses 
take 5 or more wait-states. The code was run under a 
virtual-memory multitasking executive. Shown below 
are measured results: 

System: 33.3 MHz 80860 with a single bank of
static-column DRAM

                                         Time (including
Type of FFT                  Time        bit-reversal)

1024-point-complex, DIF      1.17 ms     1.33 ms
1024-point-real                          0.67 ms
512-point-complex, DIF       0.48 ms     0.56 ms
512-point-real                           0.33 ms
256-point-complex, DIF       0.22 ms     0.26 ms
1024-point-complex, DIT                  1.37 ms
512-point-complex, DIT                   0.59 ms



7.1 Cache Fill and Writeback Time 

Measured times do not include cache-fill and writeback.
That is, the timings measured 200,000 executions
of the FFT using the same input array. (Performance
figures offered by other manufacturers for DSP chips
likewise assume that the data is already in on-chip
RAM. Of course, the i860 CPU will do that fetching
automatically into its data cache.) The additional time
for cache fill and writeback was measured as:

1024-point-complex    0.25 ms (8 Kbytes fetched, 8 Kbytes writeback)

512-point-complex     0.12 ms (4 Kbytes)

To quantify the calculations in MFlops (Millions of 
FLoating-point OPerations per Second), consider that 
the 1024-point complex FFT is implemented with 
about 16,400 multiplies and 28,700 adds/subtracts. 
Thus the 1.17 ms translates to a sustained 38.5 MFlops 
rate. For 512 points, the required 20,000 Flops means 
41.6 MFlops. 

The overall FFT is about 10 times faster than the equiv- 
alent Fortran. Inner loop performance was measured at 
13 cycles for the 24 instructions, which is 6.5 cycles per 
butterfly. 
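The MFlops and cycles-per-butterfly figures follow from simple division, sketched here in C. The operation counts and times are the ones quoted in this section; the function names are hypothetical.

```c
/* Sustained rate in MFlops for `flops` floating-point operations
 * completed in `seconds`. */
static double mflops(double flops, double seconds)
{
    return flops / seconds / 1.0e6;
}

/* Cycles per butterfly for a loop that takes `loop_cycles` cycles and
 * completes `butterflies_per_loop` butterflies per iteration. */
static double cycles_per_butterfly(double loop_cycles,
                                   unsigned butterflies_per_loop)
{
    return loop_cycles / butterflies_per_loop;
}
```

With 16,400 multiplies plus 28,700 adds/subtracts in 1.17 ms, `mflops(45100.0, 1.17e-3)` comes to about 38.5, and `cycles_per_butterfly(13.0, 2)` gives the 6.5 cycles quoted for the two-butterfly inner loop.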



Algorithm: Radix-2 FFT, in-place. Data is IEEE 754 
single-precision floating point. Implemented in assem- 
bly-language and Fortran code. 
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8.0 CODE HIERARCHY 

Pictured below are the programs developed for the i860 CPU FFT: 
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The Fortran program ffttest.f is the highest-level pro- 
gram of those listed on the following pages. It calls two 
FFT subroutines, diff.f and fft.f, then compares their 
outputs. Fft.f is a Fortran decimation-in-time algo- 
rithm, while diff.f is the high-speed DIF routine. Diff.f 
is callable by C or Fortran applications. It in turn calls 
difstep, which is implemented in assembly code 
(difstep.ss). Difstep is called once per stage of the FFT. 
A Fortran version (difstepf.f) is shown, for comparison. 
Other assembly routines are the bit-reversal-data-move- 
ment (bitrev.ss) and prefetch ("fetch" inside bitrev.ss). 

Difstep.ss contains approximately 225 assembly in- 
structions, and bitrev.ss contains about 24. The Fortran 
diff.f compiles to about 80 instructions. 

A Decimation-in-Time version of diff.f and difstep.ss 
can be found in ditt.f and ditstep.ss. The DIT version 
performs 5-10% slower than the Decimation-in-Fre- 
quency because the DIT loop takes 7 cycles per butter- 
fly, while DIF takes 6. 

A real-input algorithm is dirr.f, which can be called 
and tested using program real.f. Dirr.f calls difstep to 
do a complex DIF FFT on N real data points, but 
treats them as N/2 complex points. Then realfix.ss is 
called by dirr.f to fix the DIF output, compensating for 
the treatment of the N real points as N/2 complex. The 
derivation of the real-fix can be found in reference 3, 
Numerical Recipes in C. 

The mixture of Fortran, C, and assembly code is accomplished
by passing function inputs and outputs in
registers. Only pointers and integer values were used in
the above code, but floating-point parameters can also
be exchanged. A calling program feeds arguments to a
function in r16, r17, and higher-numbered integer registers.
The callee is permitted to destroy the contents of
those registers, but r1:r15 must be preserved. For more
details on parameter-passing conventions see the i860
64-bit Microprocessor Programmer's Reference Manual,
Chapter 8.



9.0 CONCLUSION 

The i860 CPU computes very Fast Fourier Transforms, 
quicker than most high-end dedicated DSP chips. Con- 
tributing to the FFT performance are the 8-kByte on- 
chip data cache and 4-kByte instruction cache. Also the 
8-byte external data bus, pfld instruction, and 16-byte 
data cache width provide sufficient bandwidth to keep 
the arithmetic units busy. Dual-Operation instructions 
and Dual-Instruction-Mode allow parallel data move- 
ment and calculations. The 33.3 MHz clock rate allows 
both an add and a multiply every 30 ns, giving a time of 
1.17 ms for a 1024-point complex FFT. A 40 MHz i860
Microprocessor will yield a time of less than 1 ms.
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APPENDIX A 

PROGRAM LISTINGS 



 1) diff.f:
    Fortran module to do fast Decimation-In-Frequency (DIF) Radix-2 FFT.

 2) difstep.ss:
    Assembly code which does all DIF FFT butterflies; called by diff.f.

 3) difstepf.f:
    Fortran equivalent of difstep.ss. Included here for clarity.

 4) bitrev.ss:
    Assembly code to do bit-reversal.

 5) ffttest.f:
    Highest-level Fortran code. Tests diff.f or ditt.f.

 6) ditt.f:
    Fortran module to do fast Decimation-In-Time (DIT) Radix-2 FFT.

 7) ditstep.ss:
    Assembly code which does all DIT FFT butterflies; called by ditt.f.

 8) dirr.f:
    Fortran module for Real-Input Decimation-In-Frequency (DIF) Radix-2 FFT.

 9) realfix.ss:
    Assembly code required by dirr.f to compensate for Real-Input.

10) real.f:
    Highest-level Fortran code, for Real-value input. Tests dirr.f.

11) fft.f:
    Fortran FFT algorithm. Generates "correct" answers for comparison
    against the other code.

12) makefile:
    Unix V/386 version of a makefile to maintain the FFT code, using the
    Unix "make" program-maintenance utility. Note that this makefile uses
    the Unix macro preprocessor "m4" to convert symbolic names to
    register numbers.

13) start.ss:
    Assembly code preamble for Fortran runtime.

14) time.c:
    Dummy routine, used to install breakpoints.
C
C File: diff.f
C FFT - Decimation in Freq, radix-2, inplace, 1-dimen
C Intel assumes no responsibility for use or misuse of this code.
C 5/19/89: call fetch8() added for 1024-point caching.
C 6/01/89: fetch() CRUCIAL - 30% performance loss if removed
C Inputs:
C   A= complex array of input, up to 1024 pts, single-prec float
C   M= log of number of pts
C    = (number of stages of FFT)
C   N= number of points, ie, N= 2**M = number of pts
C   W= complex array of twiddle factors, length N/2.
C   REV= 0 if bitreversed output ok. 1=must re-order output
C
C Outputs:
C   A= complex fft of input A
C
      subroutine diff(a,m,N,W,REV)
      integer m,N,i,j,k,REV,wlimit
      integer offset,stage,groups,wincr,powers2(0:10)
      complex a(n),w(N/2),temp
      data powers2 /1,2,4,8,16,32,64,128,256,512,1024/
C Powers2 to avoid calls to POW, DIV
C Twiddle factor array w(k) has (cos, -sin) of 2pi*k/N
CC Assume the caller provides w(k) constants ALREADY initialized
C ---
C Pre-touch data, lock into cache, for 8kByte fft:
      IF (N .gt. 513) THEN
         call fetch(a,%VAL(n))
      ENDIF
C
      wlimit = 8*((N/2) - 1)
C "DO 20" stage-loop
      DO 20 stage = 1,m
         groups = powers2(stage-1)
C groups=number of times the twiddle factors are used, ie, the number of
C smaller DFTs the stage is split into.
C offset gets N/2, N/4, N/8, N/16,...
         offset = powers2(m-stage)
         wincr = groups
         call difstep(a,w,groups,offset,wincr,wlimit)
 20   CONTINUE
      IF (REV .ne. 0) THEN
CC REV .ne. 0 means must do bit-reversal reordering of output
         call bitrev(a,%VAL(M),n)
      ENDIF
      RETURN
      END
C
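The per-stage parameters that diff.f passes to difstep, and the modulo addressing of the twiddle table through wlimit, can be sketched in C as follows. This is an illustrative restatement of the Fortran above, with hypothetical names; the wrap relies on the table size in bytes being a power of 2 and on byte indices being multiples of 8.

```c
typedef struct {
    unsigned groups;   /* number of sub-DFTs in this stage */
    unsigned offset;   /* element distance between A and B; butterflies per group */
    unsigned wincr;    /* element stride between successive twiddles */
} stage_params;

/* Parameters for stage = 1..m of an N = 2**m point DIF FFT,
 * mirroring the "DO 20" loop in diff.f. */
static stage_params dif_stage(unsigned m, unsigned stage)
{
    stage_params p;
    p.groups = 1u << (stage - 1);
    p.offset = 1u << (m - stage);
    p.wincr  = p.groups;
    return p;
}

/* Maximum byte index into the twiddle table: 8 * ((N/2) - 1). */
static unsigned w_limit(unsigned n)
{
    return 8u * (n / 2 - 1);
}

/* Wrap a byte index into the table with a single AND, as the assembly does.
 * Valid because the table holds N/2 eight-byte entries (a power of 2 in
 * bytes) and wind is always a multiple of 8. */
static unsigned w_wrap(unsigned wind, unsigned wlimit)
{
    return wind & wlimit;
}
```

In every stage groups * offset equals N/2, the number of butterflies per stage. For 1024 points wlimit is 4088, so a byte index of 4096 (one entry past the end) wraps to 0.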
//
// difstep.ss: do one stage of fft butterflies
// DIF = Decimation in Frequency, radix-2, inplace, 1-dimension
// (C) Copyright 1989 INTEL Corporation.
// Inner loop developed with assistance from Tricord Systems, Inc.
//
// 5/18/89: 1 pm - offset_2 added, as next-to-last stage was slow
// 5/19/89: 4 pm - fetch8() routine added, for cache miss avoidance.
// 5/31/89: am - use fst.q (13% perf improvement of inner_loop!)
//               last_bfly added, for performance.
// 6/02/89: am - bptr deleted. Modulo-address W (5% perf improved)
//-------------------------------------------------------------------
// Intel is not responsible for use nor for misuse of this program.
//
// Do one entire stage (n/2 butterflies). Sample invocation:
//      call difstep(a,w,groups,offset,wincr,wlimit)
//===================================================================
// Inputs:
//   A= complex array of input, single-prec float
//      (complex stored as 4byte real, 4byte imag contiguously)
//   W= pointer to array of twiddle factors. Assuming W(k) is
//      CMPLX(cos(2pi*k/N),-sin(2pi*k/N)) for k=0 to (N/2)-1.
//   offset = distance (except for scale-by-8byte sizeof(complex)) between
//      the 2 input values for each butterfly.
//      Offset also is the number of butterflies done per "group".
//   groups = N/(2*offset). The number of sub-DFTs this stage is split into.
//   wincr = distance (except for scale-by-8byte sizeof(complex)) between
//      successive w values for successive butterflies
//   wlimit = max index, in bytes, of W table.
//
// Outputs:
//   A= complex radix-2 butterflied version of input.
//
define(astart,r16)    //input data base address
define(wstart,r17)    //twiddle array ptr. Because w-contents depend on N,
                      // we will assume the caller has initialized w() array.
define(groups,r18)    //groups=number of sub-DFTs this stage is split into.
define(offset,r19)    //offset (initially elements, mult by 8 to get bytes)
        // between node and its dual (the 2 numbers to butterfly, ie. A and B)
define(wincr,r20)     //increment between successive W values. Remains constant
        // within a given stage. For Decimation in Freq, wincr addressing is:
        //   +8  for offset=N/2 (W0,W1,W2,W3,...W(n-1))
        //   +16     offset=N/4 (W0,W2,W4,...) etc...
define(wlimit,r21)    //max index, in bytes, of W table,
define(wind,r22)      //current index, in bytes, of W table,
define(offset2,r23)   //offset*2
define(decrem,r24)    //bla decrement
define(somecount,r25) // bla counter
define(FEtch,r26)     //pointer to 1st component of butterfly (load)
define(STore,r27)     //   "    "  1st component of butterfly (store)
// f4:f7 spare
define(AR, f12)   //element A, real component
define(AI, f13)   //   "   " , imag
define(ARo,f14)   // extra A value, for prefetch (o="odd")
define(AIo,f15)
define(BR, f16)   //element B, real component
define(BI, f17)
define(BRo,f18)   // extra B value, for prefetch
define(BIo,f19)

define(ER, f20)   //A+B, real (ER = AR + BR)
define(EI, f21)   //  "  imag  "
define(ERo,f22)   //A+B, real, previous loop's value
define(EIo,f23)   //  "  imag  "

define(FR, f24)   //W*(A-B) real
define(FI, f25)   //  "     imag
define(FRo,f26)
define(FIo,f27)

define(DR, f28)   //Difference of A-B, real part
define(DI, f29)   //  "   " , imag  "
define(WR, f30)   //W (twiddle factor), real part
define(WI, f31)   //  "   " , imag
define(WRo,f10)   //W (twiddle factor), real part (EXTRA copy)
define(WIo,f11)   //  "   " , imag

        .text
        .align  .quad
_difstep_::
        ld.l    (groups),groups      //fix Fortran call-by-ref
        ld.l    0(offset),offset     //
        shl     3,offset,offset      // change from elements to bytes
        shl     1,offset,offset2

        fst.q   f8,-16(sp)++         //save "local" regs
        fst.q   f12,-16(sp)++        //  "     "
        adds    -1,groups,groups     // pre-decrement for bnc usage, or bla usage
        adds    -16,r0,decrem        //bla decrement
// We code the last 2 stages as special cases:
//
        xor     8,offset,r0   //offset=1, special case, no complex mult, funny addressing
        bc      offset_1      // (ASSUMING offset=1 means wincr=0, and no twiddle used)
        xor     16,offset,r0  //offset=2, special case, no complex mult, funny addressing
        bc      offset_2      // (ASSUMING offset=2 means wincr=N/4)
//
        ld.l    0(wincr),wincr
        ld.l    0(wlimit),wlimit
        pfadd.ss f0,f0,f0
        pfadd.ss f0,f0,f0
        pfadd.ss f0,f0,f0            // init A1,A2,A3=0
        pfmul.ss f0,f0,f0
        pfmul.ss f0,f0,f0
        pfmul.ss f0,f0,f0
//---
// init pointers:
        shl     3,wincr,wincr        //scale for bytes.
        shl     1,wincr,wind         //init wind = 2*wincr
        pfld.d  (wstart),f0
        pfld.d  wincr(wstart),f0
        adds    -8,astart,FEtch
        pfld.d  wind(wstart),f0
        adds    wincr,wind,wind      //wind now 3*wincr
// here fetch first set of A,B,W before bla-loop
        pfld.d  wind(wstart),WR
        adds    wincr,wind,wind
        and     wlimit,wind,wind     //modulo-wlimit the w index
// We do modulo-addressing on W(), to keep the pfld pipeline full. We
// never do a W-fetch beyond the end of the table.
// And the modulo-check needs to be done only every 4th pfld, as always
// we use a multiple of 4 W() factors.

        fld.d   8(FEtch)++,AR
        fld.d   offset(FEtch),BR
  d.r2ap1.ss  f0,f0,f0               //clear Treg.
        adds    -32,offset,somecount // bla counter (predecrement by 4 elements)
//
// Definitions for pipe diagram:
// (the complex multiply product, F, broken into 4 real mult and 2 adds):
//   WR = cos(), WI = -sin().
//   DR = AR - BR; (difference of Real components of A,B)
//   DI = AI - BI; (difference of Imag components)
//   ER = AR + BR; EI = AI + BI;
//   FR = K - L; where K= WR*DR, L=WI*DI
//   FI = N + M; where M= WI*DR, N=WR*DI
// For 1st time thru inner_loop, don't have correct values to store.
// Must do 1 loop before the loop, sans the stores.

first_bfly::                         //fill pipe
//                           KR   KI   M1   M2   M3   T    A1   A2   A3   Write
  d.r2pt.ss   WR,f0,f0    // WR0  -
        pfld.d  wind(wstart),WRo
  d.pfsub.ss  AR,BR,f0    //                          -    DR0  -
        fld.d   8(FEtch)++,ARo
  d.rat1s2.ss AI,BI,f0    //           -    -    -    -    DI0  DR0
        fld.d   offset(FEtch),BRo
  d.i2st.ss   WI,f0,f0    //      WI0  -    -    -              DI0  DR0
        adds    wincr,wind,wind
  d.rat1p2.ss AR,BR,DR    //           K0   -    -    -    ER0  -    DI0  DR0
        nop
  d.ia1p2.ss  AI,BI,DI    //           L0   K0   -         EI0  ER0  -    DI0
        pfld.d  wind(wstart),WR
  d.r2pt.ss   WRo,DI,f0   // WR1  -    N0   L0   K0   -    -    EI0  ER0  -
        fld.d   8(FEtch)++,AR
  d.pfsub.ss  ARo,BRo,ER  //           N0   L0   K0   -    DR1  -    EI0  ER0
        fld.d   offset(FEtch),BR
  d.rat1s2.ss AIo,BIo,EI  //           -    N0   L0   K0   DI1  DR1  -    EI0
        adds    wincr,wind,wind
  d.i2st.ss   WIo,DR,f0   //      WI1  M0   -    N0   K0   K-L  DI1  DR1  -
        and     wlimit,wind,wind

quickstart::
  d.rat1p2.ss ARo,BRo,DR  //           K1   M0   -    N0   ER1  FR0  DI1  DR1
        bla     decrem,somecount,inner_loop   //init LCC
  d.ia1p2.ss  AIo,BIo,DI  //           L1   K1   M0   N0   EI1  ER1  FR0  DI1
        adds    -16,astart,STore     // ptrs init 16 low, for fst.q instructions
//
// Each butterfly = 1 complx multiply, 1 complx add, 1 complx subtract
//                = 4 multiply,
//                  3 add
//                  3 subtract
//                  3 8-byte fetches (A, B, W)
//                  2 8-byte stores (A, B)
//
// 6 cycles per butterfly
//
// inner_loop: iterates "offset/2" times (eg, N/4 for stage 1, N/8 for stage2),
// for each group. It does 2 butterflies per iteration


inner_loop::
//                           KR   KI   M1   M2   M3   T    A1   A2   A3   Write
  d.r2pt.ss   WR,DI,FR    // WR2  -    N1   L1   K1   N0   N+M  EI1  ER1  FR0
        pfld.d  wind(wstart),WRo
  d.pfsub.ss  AR,BR,ERo   //           N1   L1   K1   N0   DR2  FI0  EI1  ER1
        fld.d   8(FEtch)++,ARo
  d.rat1s2.ss AI,BI,EIo   //           -    N1   L1   K1   DI2  DR2  FI0  EI1
        fld.d   offset(FEtch),BRo
  d.i2st.ss   WI,DR,FI    //      WI2  M1   -    N1   K1   K-L  DI2  DR2  FI0
        fst.q   ER,16(STore)++       //update ER/EI/ERo/EIo
  d.rat1p2.ss AR,BR,DR    //           K2   M1   -    N1   ER2  FR1  DI2  DR2
        adds    wincr,wind,wind
  d.ia1p2.ss  AI,BI,DI    //           L2   K2   M1   N1   EI2  ER2  FR1  DI2
//no need for modulo-check ("and") here, as odd num of W's have been fetched.
        pfld.d  wind(wstart),WR
//
//                           KR   KI   M1   M2   M3   T    A1   A2   A3   Write
  d.r2pt.ss   WRo,DI,FRo  // WR3  -    N2   L2   K2   N1   N+M  EI2  ER2  FR1
        adds    wincr,wind,wind
  d.pfsub.ss  ARo,BRo,ER  //           N2   L2   K2   N1   DR3  FI1  EI2  ER2
        fld.d   8(FEtch)++,AR
  d.rat1s2.ss AIo,BIo,EI  //           -    N2   L2   K2   DI3  DR3  FI1  EI2
        fld.d   offset(FEtch),BR
  d.i2st.ss   WIo,DR,FIo  //      WI3  M2   -    N2   K2   K-L  DI3  DR3  FI1
        fst.q   FR,offset(STore)     //update FR/FI/FRo/FIo
  d.rat1p2.ss ARo,BRo,DR  //           K3   M2   -    N2   ER3  FR2  DI3  DR3
        bla     decrem,somecount,inner_loop
  d.ia1p2.ss  AIo,BIo,DI  //           L3   K3   M2   N2   EI3  ER3  FR2  DI3
        and     wlimit,wind,wind     //modulo.
















end_inner_loop::                     //KEEP Pipelines full
// RE-init pointers for fetches
  d.fiadd.ss  f0,f0,f0
        adds    offset2,astart,astart   //bump to next group
//redo A, B fetches, with proper ptr.
  d.fiadd.ss  f0,f0,f0
        fld.d   0(astart),AR         //get first AR/AI in next group
  d.fiadd.ss  f0,f0,f0
        fld.d   offset(astart),BR
  d.fiadd.ss  f0,f0,f0
        adds    0,astart,FEtch

last_bfly::                          //do final 2 butterflies, start next group
//                           KR   KI   M1   M2   M3   T    A1   A2   A3   Write
  d.r2pt.ss   WR,DI,FR    // WR4  -    N3   L3   K3   N2   N+M  EI3  ER3  FR2
        pfld.d  wind(wstart),WRo
  d.pfsub.ss  AR,BR,ERo   //           N3   L3   K3   N2   DR4  FI2  EI3  ER3
        fld.d   8(FEtch)++,ARo
  d.rat1s2.ss AI,BI,EIo   //           -    N3   L3   K3   DI4  DR4  FI2  EI3
        fld.d   offset(FEtch),BRo
  d.i2st.ss   WI,DR,FI    //      WI4  M3   -    N3   K3   K-L  DI4  DR4  FI2
        fst.q   ER,16(STore)++
  d.rat1p2.ss AR,BR,DR    //           K4   M3   -    N3   ER4  FR3  DI4  DR4
        adds    wincr,wind,wind
  d.ia1p2.ss  AI,BI,DI    //           L4   K4   M3   N3   EI4  ER4  FR3  DI4
        pfld.d  wind(wstart),WR
//
//                           KR   KI   M1   M2   M3   T    A1   A2   A3   Write
  d.r2pt.ss   WRo,DI,FRo  // WR5  -    N4   L4   K4   N3   N+M  EI4  ER4  FR3
        fld.d   8(FEtch)++,AR
  d.pfsub.ss  ARo,BRo,ER  //           N4   L4   K4   N3   DR5  FI3  EI4  ER4
        adds    -32,offset,somecount // reset bla counter
  d.rat1s2.ss AIo,BIo,EI  //           -    N4   L4   K4   DI5  DR5  FI3  EI4
        adds    wincr,wind,wind
  d.i2st.ss   WIo,DR,FIo  //      WI5  M4   -    N4   K4   K-L  DI5  DR5  FI3
        adds    -1,groups,groups
  d.fnop
        fld.d   offset(FEtch),BR
  d.fnop
        bnc.t   quickstart           //branch on value of groups
  d.fnop
        fst.q   FR,offset(STore)
end_last_bfly::
  d.fnop
        br      endit
  fiadd.ss    f0,f0,f0
        fst.q   FR,offset(STore)     //repeated for bnc.t untaken case
        .align  .quad
//===================================================================



offset_1::
// want FEtch=0,2,4,6,8,... elements. ASSUMING wincr=0,
// and that w=(1,0), so that no complex mult needed, and NO W will be fetched.
// E=A+B, F=A-B. (Per double-butterfly loop: 8 pfadd, 4 dword fld, 4 fst,
// 1 bla) (fld.q required, to reduce # flds to avoid pipe stalls)
// Performance = 4 cyc/bfly best case.
//


//Redefine regs for fld.q, fst.q usage, when A and B adjacent:
define(AR3,f12) //element A, real component
define(AI3,f13) //   "    " , imag
define(BR3,f14) //element B, real component
define(BI3,f15)
define(AR4,f16) // extra A value, for prefetch
define(AI4,f17) // extra A value, for prefetch
define(BR4,f18)
define(BI4,f19)
define(ER3,f20) //A+B, real (ER = AR + BR)
define(EI3,f21) //  "  imag "
define(FR3,f22) //(A-B), real
define(FI3,f23) //  "  imag "
define(ER4,f24) //A+B, real, extra copy
define(EI4,f25) //  "  imag
define(FR4,f26)
define(FI4,f27)

//============================================
        adds    -16,astart,FEtch
        fld.q   16(FEtch)++,AR4
        adds    -1,groups,somecount //bla counter (predecremented already by 1)
//using groups=blacount on the offset_1 loop, intentionally.
        adds    -16,FEtch,STore
//startup the loop:
//                              // A1      A2   A3   Write:
        d.pfadd.ss AR4,BR4,f0   // ARn+BRn -    -
        fld.q   16(FEtch)++,AR3
        d.pfadd.ss AI4,BI4,f0   // AIn+BIn ERn  -    -
        adds    -2,r0,decrem    //2 bflies per loop
        d.pfsub.ss AR4,BR4,f0   // ARn-BRn EIn  ERn  -
        bla     decrem,somecount,offset1_loop //init LCC
        d.pfsub.ss AI4,BI4,ER4  // AIn-BIn FRn  EIn  ERnext
        nop
//                              // A1      A2   A3   Write:
offset1_loop::
        d.pfadd.ss AR3,BR3,EI4  // AR+BR   FI-  FR-  EI-
        nop
        d.pfadd.ss AI3,BI3,FR4  // AI+BI   ER   FI-  FR-
        fld.q   16(FEtch)++,AR4
        d.pfsub.ss AR3,BR3,FI4  // AR-BR   EI   ER   FI-
        fst.q   ER4,16(STore)++
        d.pfsub.ss AI3,BI3,ER3  // AI-BI   FR   EI   ER
        nop
        d.pfadd.ss AR4,BR4,EI3  // AR2+BR2 FI   FR   EI
        fld.q   16(FEtch)++,AR3
        d.pfadd.ss AI4,BI4,FR3  // AI2+BI2 ER2  FI   FR
        nop
        d.pfsub.ss AR4,BR4,FI3  // AR2-BR2 EI2  ER2  FI
        bla     decrem,somecount,offset1_loop
        d.pfsub.ss AI4,BI4,ER4  // AI2-BI2 FR2  EI2  ERnext
        fst.q   ER3,16(STore)++
//--------------------------------------------
end_offset1_loop::
        d.fiadd.ss f0,f0,f0
        br      endit
        fiadd.ss f0,f0,f0
        nop
//--------------------------------------------


        .align .quad
offset_2::
// want FEtch=0,1;4,5;8,9;12,13;... elements.
// ASSUMING wincr=N/4 (W_addr=0,N/4,0,N/4,0,...). Trivial W() factors.
// USE bla loop, incrementing FEtch by 16 (2*offset).
// Even-indexed elements identical to offset_1, W=W0, no complex mult.
// So FReven=(AR-BR), FIeven=(AI-BI).
// Odd components have W=(0,-1). So FRodd=(AI-BI), FIodd=(BR-AR).
// Each fld.q fetches AReven,AIeven,ARodd,AIodd
//Assume ER,EI,ERo,EIo are 4 contiguous regs.
//Assume FR,FI,FRo,FIo are 4 contiguous regs.
        adds    -16,astart,FEtch
        fld.q   16(FEtch)++,AR
        fld.q   16(FEtch)++,BR
        adds    0,groups,somecount //bla counter
//                              // A1      A2   A3   Write:
        pfadd.ss   AR,BR,f0     // AR+BRe  -    -    -
        pfadd.ss   AI,BI,f0     // AI+BIe  ER   -
        d.pfadd.ss ARo,BRo,f0   // ARo+BRo EI   ER
        nop
        d.pfadd.ss AIo,BIo,ER   // AIo+BIo ERo  EI   ER
        nop
        d.pfsub.ss AR,BR,EI     // AR-BRe  EIo  ERo  EI
        adds    -1,r0,decrem    //2 bflies per loop, but groups is half desired value.
        d.pfsub.ss AI,BI,ERo    // AI-BIe  FR   EIo  ERo
        adds    -16,astart,STore
        d.pfsub.ss AIo,BIo,EIo  // AIo-BIo FI   FR   EIo
        bla     decrem,somecount,offset2_loop //init LCC
        d.pfsub.ss BRo,ARo,FR   // BRo-ARo FRo  FI   FR
        nop
offset2_loop::
        d.fnop
        fld.q   16(FEtch)++,AR  //fetch AR,AI,ARo,AIo
        d.fnop
        fld.q   16(FEtch)++,BR  //fetch BR,BI,BRo,BIo
//                              // A1      A2   A3   Write:
        d.pfadd.ss AR,BR,FI     // AR+BRe  FIo  FRo  FI
        nop
        d.pfadd.ss AI,BI,FRo    // AI+BIe  ER   FIo  FRo
        nop
        d.pfadd.ss ARo,BRo,FIo  // ARo+BRo EI   ER   FIo
        fst.q   ER,16(STore)++  //update ER,EI,ERo,EIo
        d.pfadd.ss AIo,BIo,ER   // AIo+BIo ERo  EI   ER
        nop
        d.pfsub.ss AR,BR,EI     // AR-BRe  EIo  ERo  EI
        nop
        d.pfsub.ss AI,BI,ERo    // AI-BIe  FR   EIo  ERo
        fst.q   FR,16(STore)++
        d.pfsub.ss AIo,BIo,EIo  // AIo-BIo FI   FR   EIo
        bla     decrem,somecount,offset2_loop
        d.pfsub.ss BRo,ARo,FR   // BRo-ARo FRo  FI   FR
        nop







endit::
// restore regs
        fiadd.ss f0,f0,f0       //exit DIM
        fld.q   0(sp),f12
        fiadd.ss f0,f0,f0       //last DIM pair
        fld.q   16(sp),f8
        adds    32,sp,sp
        bri     r1
        nop
//--------------------------------------------
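
The offset_2 case in the listing above relies on the fact that multiplying by the twiddle factor W=(0,-1) needs no multiplier: the product is a swap of real and imaginary parts with one sign change. The following Python sketch (not part of the original listing; the function names are illustrative) checks that shortcut against an ordinary complex multiply for the DIF butterfly E=A+B, F=(A-B)*W.

```python
def butterfly_general(a, b, w):
    """DIF butterfly with an explicit complex multiply: E = A+B, F = (A-B)*W."""
    return a + b, (a - b) * w


def butterfly_w_minus_j(a, b):
    """Same butterfly specialised to W = (0,-1): FR = AI-BI, FI = BR-AR."""
    e = a + b
    f = complex(a.imag - b.imag, b.real - a.real)
    return e, f


if __name__ == "__main__":
    a, b = complex(1.5, -2.0), complex(0.25, 4.0)
    print(butterfly_general(a, b, complex(0.0, -1.0)))
    print(butterfly_w_minus_j(a, b))
```

Both calls should print the same pair of values, which is why the assembly can replace the multiply pipeline with a reversed-operand subtract (BRo-ARo).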
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; . 

c difstepf.f: do one stage of fft (DIF) butterflies
c (C) Copyright 1989 INTEL Corporation. ALL RIGHTS RESERVED.
c
c Decimation in Freq, radix-2, inplace, 1-dimen
c 6/20/89
c Do one entire stage (n/2 butterflies). Sample invocation:
c     call difstep(a,w,groups,offset,wincr)
c Inputs:
c   A= complex array of input, single-prec float
c      (complex stored as 4byte real, 4byte imag contiguously)
c   W= pointer to array of twiddle factors. Assuming W(k) is
c      CMPLX(cos(2pi*k/N), -sin(2pi*k/N)) for k=0 to (N/2)-1.
c   offset = distance (in "elements") between
c      the 2 input values for each butterfly
c   groups = number of sub-DFTs this stage is split into.
c      (groups*offset*2 = N)
c   wincr = distance between successive w values for successive butterflies
c
c Outputs:
c   A= complex butterflied version of input.
      SUBROUTINE difstep (a, w, groups, offset, wincr)
      integer groups, offset, wincr
      integer i, j, index1, iplus
      complex a(groups*offset*2), w(groups*offset), wtemp, temp
c ------------------------------------------------------------
c We implement a...
c Special case for offset=1 (last stage): no complex multiplies, simple add
c (Performance enhancement)
      IF (offset .eq. 1) THEN
CVD$ NODEPCHK
        DO 8 i = 1, (2*groups), 2
          iplus = i + 1
          temp = a(iplus)
          a(iplus) = a(i) - temp
    8   a(i) = a(i) + temp
      ELSE
C
C Special case for offset=2 (next-to-last stage): no complex multiplies,
cc simple add. (Performance enhancement)
cc For half the butterflies, W=(1,0). For the other half, W=(0,-1)
      IF (offset .eq. 2) THEN
CVD$ NODEPCHK
        DO 90 i = 1, (4*groups), 4
          iplus = i + 2
          temp = a(iplus)
          a(iplus) = a(i) - temp
   90   a(i) = a(i) + temp
C 2nd call to i-loop: w=cmplx(0,-1.)
CVD$ NODEPCHK
CVD$ NOVECTOR
        DO 92 i = 2, (4*groups), 4
          iplus = i + 2
          temp = a(i) - a(iplus)
          a(i) = a(i) + a(iplus)
   92   a(iplus) = CMPLX(AIMAG(temp), -REAL(temp))
      ELSE
C
c "DO 20" index1-loop is "outer loop"
CVD$ VECTOR
CVD$ NODEPCHK
        DO 20 index1 = 1, (2*offset*groups), (2*offset)
          j = 1
CVD$ NODEPCHK
CVD$ ALTCODE
          DO 10 i = index1, (index1+offset-1)
            iplus = i + offset
            temp = a(i) - a(iplus)
            a(i) = a(i) + a(iplus)
            a(iplus) = w(j) * temp
   10     j = j + wincr
   20   CONTINUE
      END IF
      ENDIF
      RETURN
      END
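
As a cross-check on the general ("DO 20" / "DO 10") branch of difstep above, here is a minimal Python sketch with 0-based indexing. The driver `dif_fft`, and its choice of wincr = groups for every stage, are assumptions made for illustration (the DIF driver diff.f is not shown in this listing); the stage routine itself mirrors the Fortran loop body.

```python
import cmath


def difstep(a, w, groups, offset, wincr):
    """One in-place radix-2 DIF stage (general branch, 0-based indices)."""
    for index1 in range(0, 2 * offset * groups, 2 * offset):
        j = 0
        for i in range(index1, index1 + offset):
            iplus = i + offset
            temp = a[i] - a[iplus]
            a[i] = a[i] + a[iplus]
            a[iplus] = w[j] * temp
            j += wincr


def dif_fft(x):
    """Hypothetical driver: run all stages; result is in bit-reversed order."""
    n = len(x)
    a = list(x)
    w = [cmath.exp(-2j * cmath.pi * k / n) for k in range(n // 2)]
    groups, offset = 1, n // 2
    while offset >= 1:
        difstep(a, w, groups, offset, groups)  # assumed: wincr = groups
        groups, offset = groups * 2, offset // 2
    return a


if __name__ == "__main__":
    print(dif_fft([complex(i, i + 0.5) for i in range(8)]))
```

The output of `dif_fft` is left in bit-reversed order, which is why the test program warns not to expect matching answers without the bit-reversal pass.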
cccccccccccccccccccccccccccccccccc
      subroutine fetch(a,n)
      integer n
      complex a(n), temp
cc Kludge do-nothing prefetch.
      temp = a(1)
      RETURN
      END
cccccccccccccccccccccccccccccccccc

      subroutine bitrev (a, dummy, n)
C Bit-Reverse
C Inputs:
C   A= complex array of input, single-prec float
C   dummy = %val(m). Probably unusable from Fortran.
C   N = number of input points (and output points)
C Output:
C   A = original A data, but in bit-reversed order from A
      integer n, i, j, k, ndiv2
      complex a(n), temp
c
c "DO 7" loop to in-place-bit-reverse-shuffle output
      j = 1
      ndiv2 = n / 2
      DO 7 i = 1, n-1
        IF (i .lt. j) THEN
          temp = a(j)
          a(j) = a(i)
          a(i) = temp
        ENDIF
        k = ndiv2
c "While (j .gt. k)" /*decrease j by 2**something */
    6   IF (j .gt. k) THEN
          j = j-k
          k = k / 2
          GOTO 6
        ENDIF
C Add next lower power of 2 to j
    7 j = j+k
      RETURN
      END
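
The "DO 7" loop above keeps j as a bit-reversed counter that tracks i. A small Python sketch of the same loop (1-based counters kept to match the Fortran, array accesses shifted to 0-based; the function name is illustrative) may make the carry propagation easier to follow.

```python
def bitrev_inplace(a):
    """In-place bit-reversal shuffle; len(a) is assumed to be a power of 2."""
    n = len(a)
    j = 1
    for i in range(1, n):              # DO 7 i = 1, n-1
        if i < j:                      # swap each pair only once
            a[i - 1], a[j - 1] = a[j - 1], a[i - 1]
        k = n // 2
        while j > k:                   # label 6: clear leading 1 bits of j-1
            j -= k
            k //= 2
        j += k                         # label 7: add next lower power of 2
    return a


if __name__ == "__main__":
    print(bitrev_inplace(list(range(8))))
```

For eight elements the shuffle exchanges positions 1 and 4, and 3 and 6, leaving the palindromic indices 0, 2, 5 and 7 in place.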
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// 

// bitrev.ss
// (C) Copyright 1989 INTEL Corporation. ALL RIGHTS RESERVED.
//
// BIT-reversal of 8byte array elements.
// IN PLACE.
// (Allows arrays of 8,16,32,64,128,256,512, or 1024 elements)
//
// INTEL is not responsible for use nor misuse of this code.
//
// 8/13/89
//============================================
// Invocation: (from Fortran)
//      call bitrev(a,%VAL(m))
// Inputs:
//      a = r16 = pointer to array of 8byte elements
//      m = r17 (call by value) = base-2 log of total number of elements
//          (2**m = N)
// Outputs:
//      a = Bit-reversed ordered version of A
//
// Expected best-can-do performance, and measured performances
//      approx 4*N clocks (0.06 mSec for 512 points)
//
define(astart,r16)      //initial input data base address
define(m,r17)
define(logN,r17)
define(dest1,r19)
define(dest2,r20)
define(dest3,r21)
define(dest4,r22)
define(iptr,r23)        //index-array pointer
define(decrem,r24)      //bla decrement
define(count,r25)       // bla counter

        .text
        .align .quad
_bitrev_::
_bitr_::
//fetch base address for index table (rbasetab)
// base-addr-table elements = (baseaddr, number_of_swaps-2)
// base-addr-table indexed by logN.
        shl     3,logN,r30      //scale to 8-byte-entry length
        mov     rbasetab,r29
        ld.l    r29(r30),iptr
        addu    4,r29,r29
        ld.l    r29(r30),count  //number of swaps required for this value N
        pfld.d  0(iptr),f0      //initiate fetch of first 2 bit-rev indices
        pfld.d  8(iptr)++,f0
        adds    -2,r0,decrem    //2 swaps per loop
        pfld.d  8(iptr)++,f0
        bla     decrem,count,revloop //init LCC
        pfld.d  8(iptr)++,f16   //get 2 indices, but don't cache the indices
revloop:: //2 swaps per loop
//7.5 cycles consumed for each swap, best case.
        pfld.d  8(iptr)++,f18   //2 more indices
        fxfr    f16,dest1       //transfer to integer index regs
        fxfr    f17,dest2
        fld.d   dest1(astart),f24 //fetch 2 elements to swap
        fld.d   dest2(astart),f26
        fxfr    f18,dest3
        fst.d   f24,dest2(astart)
        fst.d   f26,dest1(astart)
        fxfr    f19,dest4
        fld.d   dest3(astart),f28
        fld.d   dest4(astart),f30
        pfld.d  8(iptr)++,f16   //2 more indices
        fst.d   f28,dest4(astart)
        bla     decrem,count,revloop
        fst.d   f30,dest3(astart)
        bri     r1
        nop
//
// _fetch8_: Touch all 32-byte lines in the 8k data bytes, to get them
// into dcache. (ASSUMING .lte. 8Kbytes and .gte. 4Kbytes)
//
// Invocation= fetch(astart,num8)
// Inputs=
//      astart=r16=pointer to data which is to be touched.
//      num8=r17 (passed by VALUE, %VAL(), not by reference)
//
// Using RC and RB to improve dcache hit rates, for FFTs bigger than
// 1024 complex (8kB).
// RC=10 causes replacement only of block denoted by RB lsbit. RC=11 disables
// replacement.
//
define(num8,r17)
define(FEtch,r26)

_fetch8_::
_fetch_::
        ld.c    dirbase,r30
        or      0x800,r30,r30   // Replace Dcache slot only (RC=10,RB=00)
        st.c    r30,dirbase
// Put 4Kbytes into Dcache slot 0. (The rest after 4kB goes to slot 1).
        adds    -4,r0,decrem    //4 8-byte-groups per cache line
        adds    508,r0,count    //512, but pre-decremented for bla usage
        bla     decrem,count,floop
        adds    -32,astart,FEtch
floop::
        bla     decrem,count,floop
        fld.d   32(FEtch)++,f30 //dummy load.
        adds    -512,num8,count
        bc      fdone           //if data exhausted, quit
//      ld.c    dirbase,r30
        or      0x900,r30,r30   // Replace Dcache slot 1 only (RC=10,RB=01)
        st.c    r30,dirbase
        adds    -8,count,count  //predecr for bla
        bla     decrem,count,floop2 //set LCC
        fld.d   32(FEtch)++,f30
floop2::
        bla     decrem,count,floop2
        fld.d   32(FEtch)++,f30 //dummy load.
fdone::
// unlock dcache
        andnot  0xF00,r30,r30   //clear RC,RB (dirbase(11:8))
        st.c    r30,dirbase
        bri     r1
        nop

        .data
//
// rbasetab:: (Table of bit-reversed indices for bitrev subroutine)
// base-addr-table elements = (baseaddr, number_of_swaps-2)
// base-addr-table indexed by logN.
        .align .quad
rbasetab::
        .long [6]0              //don't bother with log(n)=0,1,2
        .long rev8,    0
        .long rev16,   4
        .long rev32,   10
        .long rev64,   26
        .long rev128,  54
        .long rev256,  118
        .long rev512,  238
        .long rev1024, 494

//number of swaps=240 for N=512 (ie, 32 symmetrical patterns
// exist between 0 and 511.)
// rev512: array of bit-reversed indices, for N=512.
// Each entry is ("i", and "bit-reversed-i"), shifted left by 3
// to account for 8-byte-elements.
// NOTE: This listing DOES NOT SHOW all the table elements, to save paper.
        .align .quad
rev512::
        .long 8, 2048, 16, 1024
        .long 24, 3072, 32, 512
        .long 40, 2560, 48, 1536
// ETC., ETC., ETC.
//
        .align .quad
rev1024::
        .long 8, 4096, 16, 2048
        .long 24, 6144, 32, 1024
        .long 40, 5120, 48, 3072
        .long 56, 7168, 64, 512
// ETC., ETC., ETC.
//Number of swaps = 496
//N (Number of elements) = 1024
//--------------------------------------------
        .align .quad
rev16::
        .long 1*8,8*8,2*8,4*8
        .long 3*8,12*8,5*8,10*8
        .long 7*8,14*8,11*8,13*8
rev8::
        .long 1*8,4*8,3*8,6*8
//--------------------------------------------
        .align .quad
rev32::
        .long 8, 128, 16, 64, 24, 192, 40, 160, 48, 96, 56, 224
        .long 72, 144, 88, 208, 104, 176, 120, 240, 152, 200, 184, 232
//--------------------------------------------
        .align .quad
rev64::
        .long 8, 256, 16, 128
        .long 24, 384, 32, 64
        .long 40, 320, 48, 192
        .long 56, 448, 72, 288
// ETC., ETC., ETC...
//--------------------------------------------
        .align .quad
rev128::
        .long 8, 512, 16, 256
        .long 24, 768, 32, 128
        .long 40, 640, 48, 384
        .long 56, 896, 72, 576
// ETC., ETC., ETC...
//Number of swaps = 56, N (Number of elements) = 128
        .align .quad
rev256::
        .long 8, 1024, 16, 512
        .long 24, 1536, 32, 256
        .long 40, 1280, 48, 768
        .long 56, 1792, 64, 128
// ETC., ETC., ETC...
//Number of swaps = 120, N (Number of elements) = 256
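
Since the listing omits most of the table entries, the sketch below shows one way the swap tables could be regenerated. It is an illustration in Python, not the tool Intel used; the function names and the `elem_bytes` parameter are assumptions. Each pair is (i, bit-reversed i) scaled to a byte offset, and a pair is emitted only when i is smaller than its reversal, so every swap appears once and palindromic indices are skipped.

```python
def bit_reverse_index(i, m):
    """Reverse the low m bits of i."""
    r = 0
    for _ in range(m):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


def swap_table(m, elem_bytes=8):
    """Flat list of byte-offset pairs (i, bitrev(i)) with i < bitrev(i), N = 2**m."""
    table = []
    for i in range(1 << m):
        r = bit_reverse_index(i, m)
        if i < r:
            table += [i * elem_bytes, r * elem_bytes]
    return table


if __name__ == "__main__":
    for m in (7, 8, 9, 10):
        print(1 << m, len(swap_table(m)) // 2)  # N, number of swaps
```

The swap counts printed for N=128, 256, 512 and 1024 can be compared with the "Number of swaps" comments in the tables above.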
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PROGRAM FFTTEST 
C
C 1-D FFT TEST PROGRAM
C Intel assumes no responsibility for use or misuse of this code.
C 7/20/89
C
      character*8 REALLY
      PARAMETER (IREV=0)
      PARAMETER (REALLY=' complex')
      PARAMETER (TIMEIT=1, CACHETIME=0)
      DATA IT/200000/
c     PARAMETER (N=1024,M=10)
      PARAMETER (N=512,M= 9)
c     PARAMETER (N=256,M= 8)
c     PARAMETER (N=128,M= 7)
c     PARAMETER (N=64,M= 6)
c     PARAMETER (N=32,M= 5)
c     PARAMETER (N=16,M= 4)
      PARAMETER (PI=3.1415926536)
      COMPLEX X(N),X1(N),X2(N),X3(N),W(N/2)
c Fortran complex values stored R,I, R,I for arrays.
      Real ASQR(N),ASQR2(N),XR(N)
      complex wtemp
      real rtemp

      PRINT *,' FFT test program (ffttest.f) ....'
      print *,' ===================='
      IF (IREV .eq. 0) THEN
        print *,'NOT counting time for bit-reversal.'
        print *,'DO NOT expect matching answers, without bit-rev'
      ELSE
        print *,'Time for bit-reversal included.'
      ENDIF
      print *,'Time for cache writeback and fills...'
      IF (CACHETIME .eq. 0) THEN
        print *,' NOT included, if iterating.'
      ELSE
        print *,' ... included.'
      ENDIF



print 
print 
print 
print 
print 
print 



If iterating. 



Number of Points 
(', REALLY,' data) ' 



Number of Iterations =',IT 
= ', N 
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c 

C Init twiddle factor array w(k) with (cos, -sin) of 2pi*k/N
C (Should just declare this as constant, if N is non-variable)
C (OR could have one constant 512-entry W (for N=1024), adjust wincr accordingly
C in diff.f for smaller N)
      rtemp = 2.0*pi/N
      wtemp = CMPLX(cos(rtemp), -sin(rtemp))
      w(1) = (1.0, 0.0)
      DO 200 k = 2, N/2
  200 w(k) = wtemp * w(k-1)
cc    print *,' W (twiddle) initialization completed '
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C INITIALIZE input data
C
      PIN = (4*PI)/ N
      DO 100 I = 1, N
c For testing with sinewave input data:
c       Treal = COS( I*PIN)
c       Timag = SIN( I*PIN)
c For testing with squarewave input:
cc      IF (I .lt. N/2) THEN
cc        Treal = 1.0
cc        Timag = 0.5
cc      ELSE
cc        Treal = 0.0
cc        Timag = 0.0
cc      ENDIF
C For testing with ramp function input data:
        Treal = I-1.0
        Timag = Treal + 0.5
        X(I) = CMPLX(Treal, Timag)
        X1(I) = CMPLX(Treal, Timag)
        X2(I) = CMPLX(Treal, Timag)
        X3(I) = CMPLX(Treal, Timag)
  100 CONTINUE
C
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
      IF (TIMEIT .ne. 0) THEN
        CALL fft (X2, M, N)
cc Subroutine fft is Decimation-In-Time, Fortran version.
c       CALL ditt(X, M, N,W,IREV)
        CALL diff(X, M, N,W,IREV)
      ENDIF
ccccccccccccccccccccccccccccccccccccccc
      IF (IREV .ne. 0) THEN
      IF (TIMEIT .eq. 0) THEN
        call vcompare(X,X2,2*N)
        call cmags (X,N,ASQR)
c cmags to take squared magnitude of complex values
        call cmags (X2,N,ASQR2)
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c c 

C print non-zero results:
        J=0
        DO 700 I = 1,N
          IF ((ASQR(I) .GT. 1.0) .OR. (ASQR2(I) .GT. 1.0)) THEN
            WRITE (6,22) (I-1), ASQR(I), ASQR2(I)
   22       FORMAT (' I-1=',I4,' ASQR(I)= ',F14.2,' ASQR2(I)= ',F14.2//)
            J = J+1
            IF (J .GT. 32) GOTO 725
          ENDIF
  700   CONTINUE
  725   CALL TIME
      ENDIF
      ENDIF

      IF (TIMEIT .ne. 0) THEN
ccccccccccccccccccccccccccccccccccccccc
cc- Timing loop follows:
        print *,' Start Ass.FFT'
        IF (CACHETIME .eq. 0) THEN
          DO 500 I=1, IT, 4
C Reuse same array, so cache fill and writeback time NOT included.
            CALL diff(X, M, N,W,IREV)
            CALL diff(X, M, N,W,IREV)
            CALL diff(X, M, N,W,IREV)
  500     CALL diff(X, M, N,W,IREV)
        ELSE
          DO 504 I=1, IT, 4
C Alternating between X,X1,X2,X3 should provide cache misses.
            CALL diff(X, M, N,W,IREV)
            CALL diff(X1, M, N,W,IREV)
            CALL diff(X2, M, N,W,IREV)
  504     CALL diff(X3, M, N,W,IREV)
        ENDIF
        print *,' END Ass. FFT'
ccccccccccccccccccccccccccccccccccccccc
      ENDIF
      STOP
      END

      subroutine vcompare (res,exp,n)
c
c VCOMPARE compares 2 REAL vectors, prints out 1st few miscompares
      integer n, errcnt
      real res(n), exp(n)
      write(6,12)
   12 format (' *** VCOMPARE: vector comparison beginning ***')
      data errcnt/0/
      do 30 i = 1,n
        if (AINT(res(i)) .ne. AINT(exp(i))) then
c [print out error, exit if alot already]
  120     print *,'*** Error in compares ***'
          write(6,121) i
  121     format (' Item number = ',I6)
          write (6,124) res(i), exp(i)
  124     format (' Res_= ',F14.2,' Expected_= ',F14.2)
          errcnt = errcnt + 1
          if (errcnt .gt. 19) then
            return
          end if
        end if
   30 continue
      if (errcnt .eq. 0) then
  190   print *,' *** vector compares SUCCESSFUL ***'
      end if
   99 return
      end
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c . 

C File: ditt.f
C 6/15/89
C Intel assumes no responsibility for use or misuse of this code.
C FFT - Decimation in TIME, radix-2, inplace, 1-dimen
C Inputs:
C   A= complex array of input, up to 1024 pts, single-prec float
C   M= log of number of pts
C    = (Number of stages of FFT)
C   N = number of points, ie, N= 2**M = number of pts
C   W= complex array of twiddle factors, length=N/2.
C   REV= ignored parameter.
C
C Outputs:
C   A= complex fft of input A. Correct order (bit-reversal done).
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
      subroutine ditt (a, m, N, W, REV)
      integer m, N, i, REV, wlimit
      integer offset, stage, groups, wincr, powers2(0:10)
      complex a(n), w(N/2), temp
      data powers2 /1,2,4,8,16,32,64,128,256,512,1024/
C Powers2 to avoid calls to POW, DIV
C Twiddle factor array w(i) has (cos, -sin) of 2pi*i/N
CC Assume the caller provides w(i), constants ALREADY initialized
C
C Pre-touch data, lock into cache, for 8kByte fft:
      IF (N .gt. 513) THEN
        call fetch(a,%VAL(n))
      ENDIF
C
      call bitrev(a,%VAL(M),n)
C Bitreversal of input needed for in-place decim in time FFT, to avoid
C fetching twiddle-factors in bitrev order.
      wlimit = 8*((N/2) - 1)
      DO 20 stage = 1,m
        groups = powers2(m-stage)
C groups=number of times the twiddle factors are used, ie, the number of
C smaller DFTs the stage is split into.
C offset gets 1,2,4,8,...N/2
        offset = powers2(stage-1)
        wincr = groups
        call ditstep(a,w,groups,offset,wincr,wlimit)
   20 CONTINUE
      RETURN
      END
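
The stage schedule in ditt (groups = 2**(m-stage), offset = 2**(stage-1), wincr = groups) can be exercised with a short Python sketch. This is an illustrative model, not a transcription of ditstep.ss: the `ditstep` body below is a plain-loop assumption built from the butterfly definition E = A+(B*W), F = A-(B*W) given in the assembly comments, and `bit_reverse` stands in for the table-driven bitrev routine.

```python
import cmath


def bit_reverse(a):
    """Return a copy of a in bit-reversed order (len(a) a power of 2)."""
    n = len(a)
    m = n.bit_length() - 1
    out = list(a)
    for i in range(n):
        r, t = 0, i
        for _ in range(m):
            r, t = (r << 1) | (t & 1), t >> 1
        out[r] = a[i]
    return out


def ditstep(a, w, groups, offset, wincr):
    """One in-place radix-2 DIT stage: E = A + B*W, F = A - B*W."""
    for g in range(groups):
        base = g * 2 * offset
        for j in range(offset):
            p = a[base + j + offset] * w[j * wincr]
            a[base + j + offset] = a[base + j] - p
            a[base + j] = a[base + j] + p


def ditt(x):
    """Bit-reverse the input, then run m stages as in the Fortran driver."""
    n = len(x)
    m = n.bit_length() - 1
    w = [cmath.exp(-2j * cmath.pi * k / n) for k in range(n // 2)]
    a = bit_reverse(x)
    for stage in range(1, m + 1):
        groups = 1 << (m - stage)
        offset = 1 << (stage - 1)
        ditstep(a, w, groups, offset, groups)  # wincr = groups
    return a


if __name__ == "__main__":
    print(ditt([complex(i, i + 0.5) for i in range(8)]))
```

Because the input is shuffled first, the result comes out in natural order, matching the "Correct order (bit-reversal done)" note in the header.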

C 
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// 

// ditstep.ss: do one stage of fft butterflies
// DIT = Decimation in Time, radix-2, inplace, 1-dimension
// (C) Copyright 1989 INTEL Corporation. ALL RIGHTS RESERVED.
// 7/15/89
//
// Intel is not responsible for use nor for misuse of this program.
//
// Do one entire stage (n/2 butterflies). Sample invocation:
//      call ditstep(a,w,groups,offset,wincr,wlimit)
//============================================================
// Inputs:
//   A= complex array of input, single-prec float
//      (complex stored as 4byte real, 4byte imag contiguously)
//   W= pointer to array of twiddle factors. Assuming W(k) is
//      CMPLX(cos(2pi*k/N), -sin(2pi*k/N)) for k=0 to (N/2)-1.
//   offset = distance (except for scale-by-8byte sizeof(complex)) between
//      the 2 input values for each butterfly.
//      Offset also is the number of butterflies done per "group".
//   groups = N/(2*offset). The number of sub-DFTs this stage is split into.
//   wincr = distance (except for scale-by-8byte sizeof(complex)) between
//      successive w values for successive butterflies
//   wlimit = max index, in bytes, of W table.
//
// Outputs:
//   A= complex radix-2 butterflied version of input.
//
define(astart,r16)      // input data base address
define(wstart,r17)      //twiddle array ptr. Because w-contents depend on N,
                        // we will assume the caller has initialized w() array.
define(groups,r18)      //groups=number of sub-DFTs this stage is split into.
define(offset,r19)      //offset (initially elements, mult by 8 to get bytes)
                        // between node and its dual (the 2 numbers to butterfly, ie. A and B)
define(wincr,r20)       //increment between successive W values. Remains constant
                        // within a given stage.
define(wlimit,r21)      //max index, in bytes, of W table.
define(wind,r22)        //current index, in bytes, of W table.
define(offset2,r23)     //offset*2
define(decrem,r24)      //bla decrement
define(somecount,r25)   // bla counter
define(FEtch,r26)       //pointer to 1st component of butterfly (load)
define(STore,r27)       //  "     "  1st component of butterfly (store)
define(offsetp8,r28)    //offset+8
// f4:f7 spare
define(ARe,f12) //element A, real component
define(AIe,f13) //   "    " , imag
define(ARo,f14) // extra A value, for prefetch (o="odd")
define(AIo,f15)
define(BRe,f16) //element B, real component
define(BIe,f17)
define(BRo,f18) // extra B value, for prefetch
define(BIo,f19)
define(ERe,f20) //A+(B*W), real (ER = AR + BR)
define(EIe,f21) //   "     imag "
define(ERo,f22) // previous loop's value
define(EIo,f23) //   "     imag "
define(FRe,f24) //A-(B*W), real
define(FIe,f25) //   "     imag "
define(FRo,f26) // previous loop's value
define(FIo,f27) //   "     imag "
define(PR,f28)  //(B*W), real
define(PI,f29)  //(B*W), imag
define(WRe,f30) //W (twiddle factor), real part
define(WIe,f31) //   "    "         , imag
define(WRo,f10) //W (twiddle factor), real part (EXTRA copy)
define(WIo,f11) //   "    "         , imag


        .text
        .align .quad
_ditstep_::
        ld.l    0(groups),groups //fix Fortran call-by-ref
        ld.l    0(offset),offset //
        shl     3,offset,offset //change from elements to bytes
        shl     1,offset,offset2
        adds    8,offset,offsetp8
        fst.q   f8,-16(sp)++    //save "local" regs
        fst.q   f12,-16(sp)++   //  "     "
        adds    -1,groups,groups // pre-decrement for bnc usage, or bla usage
        adds    -16,r0,decrem   //bla decrement
// We code the last 2 stages as special cases:
//
        xor     8,offset,r0     //offset=1, special case, no complex mult, funny addressing
        bc      offset_1        // (ASSUMING offset=1 means wincr=0, and no twiddle used)
        xor     16,offset,r0    //offset=2, special case, no complex mult
        bc      offset_2
//
        ld.l    0(wincr),wincr
        ld.l    0(wlimit),wlimit
        pfadd.ss f0,f0,f0
        pfadd.ss f0,f0,f0
        pfadd.ss f0,f0,f0       // init A1,A2,A3=0
        pfmul.ss f0,f0,f0
        pfmul.ss f0,f0,f0
        pfmul.ss f0,f0,f0
//
// init pointers:
        shl     3,wincr,wincr   //scale for bytes.
        shl     1,wincr,wind    //init wind = 2*wincr
        pfld.d  0(wstart),f0
        pfld.d  wincr(wstart),f0
        adds    -8,astart,FEtch
        pfld.d  wind(wstart),f0
        adds    wincr,wind,wind //wind now 3*wincr
// here fetch first set of B,W before bla-loop
        pfld.d  wind(wstart),WRe
        adds    wincr,wind,wind
//first Bfetch from offset, then 1st afetch from 0.
        fld.d   offsetp8(FEtch),BRe //first B value
        and     wlimit,wind,wind //modulo-wlimit the w index
// We do modulo-addressing on W(), to keep the pfld pipeline full. We
// never do a W-fetch beyond the end of the table.
// And the modulo-check needs to be done only every 4th pfld, as always
// we use a multiple of 4 W() factors.
        d.r2ap1.ss f0,f0,f0     //clear Treg.
        adds    -32,offset,somecount // bla counter (predecrement by 4 elements)
//
// Definitions for pipe diagram:
//   Anew = E = A+(B*W)
//   Bnew = F = A-(B*W)
//   Let P=(B*W).
//   (the complex multiply product, P, broken into 4 real mult and 2 adds);
//   WR = cos(), WI=-sin().
//   PR = K - L; where K= WR*BR, L=WI*BI
//   PI = N + M; where N= WI*BR, M=WR*BI
//   ER = AR + PR (Overwrites AR)
//   EI = AI + PI (    "      AI)
//   FR = AR - PR (    "      BR)
//   FI = AI - PI (    "      BI)
//
// For 1st time thru inner-loop, don't have correct values to store.
// Must do 1 loop before the loop, sans the stores.
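
The K, L, M, N naming in the pipe-diagram definitions can be checked numerically. The Python sketch below (an illustration only; the function name and argument order are assumptions) evaluates one DIT butterfly with four real multiplies and the add/subtract steps exactly as the definitions state, and can be compared with ordinary complex arithmetic.

```python
def dit_butterfly(ar, ai, br, bi, wr, wi):
    """One DIT butterfly using 4 real multiplies, named as in the pipe diagram."""
    k = wr * br
    l = wi * bi
    n = wi * br
    m = wr * bi
    pr = k - l          # real part of P = B*W
    pi = n + m          # imaginary part of P = B*W
    er, ei = ar + pr, ai + pi   # E = A + P, overwrites A
    fr, fi = ar - pr, ai - pi   # F = A - P, overwrites B
    return (er, ei), (fr, fi)


if __name__ == "__main__":
    print(dit_butterfly(1.0, 2.0, 3.0, -1.0, 0.6, -0.8))
```

Counting operations in the function body gives the 4 multiplies, 3 adds and 3 subtracts per butterfly that the listing quotes.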



rst_bfly:: //fill pipe
// KR...KI...M1...M2...M3 A1...A2...A3...Write
        d.r2pt.ss   WRe,f0,f0   // WRe -
        pfld.d  wind(wstart),WRo
        d.i2st.ss   WIe,f0,f0   // WIe
        adds    wincr,wind,wind
        d.r2ap1.ss  f0,BRe,f0   // K0
        fld.d   8(FEtch)++,ARe  //first A value
        d.pfmul.ss  WIe,BIe,f0  // L0 K0 - - -
        pfld.d  wind(wstart),WRe
        d.r2pt.ss   WRo,BIe,f0  // WRo M0 L0 K0 - - -
        fld.d   offsetp8(FEtch),BRo
        d.rat1s2.ss f0,PR,f0    // M0 L0 K0 - -
        adds    wincr,wind,wind
        d.i2st.ss   WIo,BRe,f0  // WIo N0 - M0 K0 K-L0
        nop
//
        d.r2ap1.ss  f0,BRo,f0   // K1 N0 - M0 - PR0
        and     wlimit,wind,wind
        d.pfsub.ss  f0,PI,f0    // K1 N0 - M0 - - PR0
        fld.d   8(FEtch)++,ARo
        d.pfadd.ss  ARe,PR,PR   // K1 N0 - M0 ER0 PR0
        fld.d   offsetp8(FEtch),BRe
        d.pfmul.ss  WIo,BIo,f0  // L1 K1 N0 M0 ER0 -
        nop
        d.r2pt.ss   WRe,BIo,f0  // WRe M1 L1 K1 M0 M+N0 ER0
        bla     decrem,somecount,restart //init LCC
        d.rat1s2.ss ARe,PR,f0   // - M1 L1 K1 FR0 PI0 ER0
        nop
restart::
        d.i2st.ss   WIe,BRo,ERe // WIe N1 - M1 K1 K-L1 FR0 PI0 ER0



        adds    -16,astart,STore // ptrs init 16 low, for fst.q instructions
//
// Each butterfly = 1 complx multiply, 1 complx add, 1 complx subtract
//                = 4 multiply, 3 add, 3 subtract
//                  3 8-byte fetches (A, B, W)
//                  2 8-byte stores (A, B)
//
// 7 cycles per butterfly
//
// inner_loop: iterates "offset/2" times
// for each group. It does 2 butterflies per iteration
// AR/AI fetches need to be a cycle behind BR/BI fetches here. So we
// must index with offset+8 into B.
// AR is used 1/2 loop before AI.
// Patterns AI0,AR1,BR2,BI2;AI1,AR2,BR3,BI3.

inner_loop::
// KR...KI...M1...M2...M3 T A1...A2...A3...Write
        d.r2ap1.ss  AIe,BRe,PI  // K2 N1 - M1 EI0 PR1 FR0 PI0
        pfld.d  wind(wstart),WRo
        d.pfsub.ss  AIe,PI,FRe  // K2 N1 - M1 FI0 EI0 PR1 FR0
        fld.d   8(FEtch)++,ARe
        d.pfadd.ss  ARo,PR,PR   // K2 N1 - M1 ER1 FI0 EI0 PR1
        fld.d   offsetp8(FEtch),BRo
        d.pfmul.ss  WIe,BIe,f0  // L2 K2 N1 M1 ER1 FI0 EI0 -
        adds    wincr,wind,wind
        d.r2pt.ss   WRo,BIe,EIe // WRo M2 L2 K2 M+N1 ER1 FI0 EI0
        pfld.d  wind(wstart),WRe
        d.rat1s2.ss ARo,PR,FIe  // - M2 L2 K2 FR1 PI1 ER1 FI0
        adds    wincr,wind,wind
        d.i2st.ss   WIo,BRe,ERo // WIo N2 - M2 K2 K-L2 FR1 PI1 ER1
        and     wlimit,wind,wind //modulo.
// KR...KI...M1...M2...M3 T A1...A2...A3...Write
        d.r2ap1.ss  AIo,BRo,PI  // K3 N2 - M2 EI1 PR2 FR1 PI1
        nop
        d.pfsub.ss  AIo,PI,FRo  // K3 N2 - M2 FI1 EI1 PR2 FR1
        fld.d   8(FEtch)++,ARo
        d.pfadd.ss  ARe,PR,PR   // K3 N2 - M2 ER2 FI1 EI1 PR2
        fld.d   offsetp8(FEtch),BRe
        d.pfmul.ss  WIo,BIo,f0  // L3 K3 N2 M2 ER2 FI1 EI1 -
        nop
        d.r2pt.ss   WRe,BIo,EIo // WRe M3 L3 K3 M+N2 ER2 FI1 EI1
        fst.q   ERe,16(STore)++ //update ERe/EIe/ERo/EIo
        d.rat1s2.ss ARe,PR,FIo  // - M3 L3 K3 FR2 PI2 ER2 FI1
        bla     decrem,somecount,inner_loop
        d.i2st.ss   WIe,BRo,ERe // WIe N3 - M3 K3 K-L3 FR2 PI2 ER2
        fst.q   FRe,offset(STore) //update FRe/FIe/FRo/FIo
end_inner_loop:: //KEEP Pipelines full
// RE-init pointers for fetches
        d.fiadd.ss f0,f0,f0
        adds    offset2,astart,astart //bump to next group
//redo A,B fetches, with proper ptr.
        d.fiadd.ss f0,f0,f0
        fld.d   offset(astart),BRe //get first BR/BI in next group
        d.fiadd.ss f0,f0,f0
        adds    -8,astart,FEtch












last_bfly:: //do final 2 butterflies, start next group
// KR...KI...M1...M2...M3 T A1...A2...A3...Write
d.r2apl.ss AIe,BRe,PI // K0 N3 - M3 EI2 PR3 FR2 PI2
pfld.d wind(wstart),WRo
d.pfsub.ss AIe,PI,FRe // K0 N3 - M3 FI2 EI2 PR3 FR2
fld.d 8(FEtch)++,ARe
d.pfadd.ss ARo,PR,PR // K0 N3 - M3 ER3 FI2 EI2 PR3
fld.d offsetp8(FEtch),BRo
d.pfmul.ss WIe,BIe,f0 // L0 K0 N3 M3 ER3 FI2 EI2 -
adds wincr,wind,wind
d.r2pt.ss WRo,BIe,EIe // WRo M0 L0 K0 M+N3 ER3 FI2 EI2
pfld.d wind(wstart),WRe
d.ratls2.ss ARo,PR,FIe // - M0 L0 K0 FR3 PI3 ER3 FI2
adds wincr,wind,wind
d.i2st.ss WIo,BRe,ERo // WIo N0 - M0 K0 K-L0 FR3 PI3 ER3
and wlimit,wind,wind //modulo
//
d.r2apl.ss AIo,BRo,PI // K1 N0 - M0 EI3 PR0 FR3 PI3
adds -32,offset,somecount // reset bla counter
d.pfsub.ss AIo,PI,FRo // K1 N0 - M0 FI3 EI3 PR0 FR3
fld.d 8(FEtch)++,ARo
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d.pfadd.ss ARe,PR ,PR // Kl NO - MO ERO FI3 EI3 PRO 

fld.d offsetp8(FEtch),BRe
d.pfmul.ss WIo,BIo,f0 // L1 K1 N0 M0 ER0 FI3 EI3
bla decrem,somecount,nowhere //re-init LCC=1
d.r2pt.ss WRe,BIo,EIo // WRe M1 L1 K1 M+N0 ER0 FI3 EI3
adds -1,groups,groups
nowhere::
d.ratls2.ss ARe,PR,FIo // - M1 L1 K1 FR0 PI0 ER0 FI3
fst.q ERe,16(STore)++
d.fnop
bnc.t restart //branch on value of groups
d.fnop
fst.q FRe,offset(STore)

end_last_bfly::
d.fnop
br endit
fiadd.ss f0,f0,f0
fst.q FRe,offset(STore) //repeated for bnc.t untaken case
.align .quad

offset_1::
// want FEtch=0,2,4,6,8,... elements. ASSUMING wincr=0,
// and that w=(1,0), so that no complex mult needed.
// E=A+B, F=A-B. (Per double-butterfly loop: 8 pfadd, 4 dword fld, 4 fst,
// 1 bla) (fld.q used to reduce # flds)
// Performance = 4 cyc/bfly best case.

//Redefine regs for fld.q, fst.q usage, when A and B adjacent:
define(AR3,f12) //element A, real component
define(AI3,f13) // " " , imag
define(BR3,f14) //element B, real component
define(BI3,f15)
define(AR4,f16) // extra A value, for prefetch
define(AI4,f17)
define(BR4,f18)
define(BI4,f19)
define(ER3,f20) //A+B, real (ER = AR + BR)
define(EI3,f21) // " imag "
define(FR3,f22) //(A-B), real
define(FI3,f23) // " imag
define(ER4,f24) //A+B, real
define(EI4,f25) // " imag
define(FR4,f26) //(A-B), real
define(FI4,f27) // " imag
//=============================================
adds -16,astart,FEtch
fld.q 16(FEtch)++,AR4
adds -1,groups,somecount // bla counter (predecremented already by 1)
//using groups=blacount on the offset_1 loop, intentionally,
adds -16,FEtch,STore
//startup the loop:
//startup the loop: 






//                      // A1...... A2.... A3.... Write:
d.pfadd.ss AR4,BR4,f0   // ARn+BRn  -
fld.q 16(FEtch)++,AR3
d.pfadd.ss AI4,BI4,f0   // AIn+BIn  ERn
adds -2,r0,decrem //2 bflies per loop
d.pfsub.ss AR4,BR4,f0   // ARn-BRn  EIn    ERn
bla decrem,somecount,offset1_loop //init LCC
d.pfsub.ss AI4,BI4,ER4  // AIn-BIn  FRn    EIn    ERn
nop
//
//                      // A1...... A2.... A3.... Write:
offset1_loop::
d.pfadd.ss AR3,BR3,EI4  // AR+BR    FI-    FR-    EI-
nop
d.pfadd.ss AI3,BI3,FR4  // AI+BI    ER     FI-    FR-
fld.q 16(FEtch)++,AR4
d.pfsub.ss AR3,BR3,FI4  // AR-BR    EI     ER     FI-
fst.q ER4,16(STore)++
d.pfsub.ss AI3,BI3,ER3  // AI-BI    FR     EI     ER
nop
d.pfadd.ss AR4,BR4,EI3  // AR2+BR2  FI     FR     EI
fld.q 16(FEtch)++,AR3
d.pfadd.ss AI4,BI4,FR3  // AI2+BI2  ER2    FI     FR
nop
d.pfsub.ss AR4,BR4,FI3  // AR2-BR2  EI2    ER2    FI
bla decrem,somecount,offset1_loop
d.pfsub.ss AI4,BI4,ER4  // AI2-BI2  FR2    EI2    ERnext
fst.q ER3,16(STore)++
//
end_offset1_loop::
d.fiadd.ss f0,f0,f0
br endit
fiadd.ss f0,f0,f0
nop
//

// 

.align .quad 

offset_2:: 

// want FEtch=0,1;4,5;8,9;12,13;... elements.
// ASSUMING wincr=N/4 (W_addr=0,N/4,0,N/4,0,...), Trivial W() factors.
// Even-indexed elements identical to offset_1, W=W0, no complex mult.
// So EReven=(AR+BR), EIeven=(AI+BI).
// So FReven=(AR-BR), FIeven=(AI-BI).
// Odd components have W=(0,-1). So B*W = (BI,-BR),
// So ERodd=Re(A+(B*W)) = (AR+BI) EIodd=(AI-BR),
// So FRodd=Re(A-(B*W)) = (AR-BI) FIodd=(AI+BR).
// Each fld.q fetches AReven,AIeven,ARodd,AIodd.

//Assume ERe,EIe,ERo,EIo are 4 contiguous regs.
//Assume FRe,FIe,FRo,FIo are 4 contiguous regs.
//Assume ARe,AIe,ARo,AIo are 4 contiguous regs.
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adds -16,astart,FEtch 








fld.q 16(FEtch)++,ARe
fld.q 16(FEtch)++,BRe
adds 0,groups,somecount //bla counter
//startup the loop:
//                      // A1...... A2.... A3.... Write:
pfadd.ss ARe,BRe,f0     // AR+BRe   -
pfadd.ss AIe,BIe,f0     // AI+BIe   ER     -
d.pfadd.ss ARo,BIo,f0   // ARo+BIo  EI     ER
nop
d.pfsub.ss AIo,BRo,ERe  // AIo-BRo  ERo    EI     ER
nop
d.pfsub.ss ARe,BRe,EIe  // AR-BRe   EIo    ERo    EI
adds -1,r0,decrem //2 bflies per loop, but groups is half desired value.
d.pfsub.ss AIe,BIe,ERo  // AI-BIe   FR     EIo    ERo
adds -16,astart,STore
d.pfsub.ss ARo,BIo,EIo  // ARo-BIo  FI     FR     EIo
bla decrem,somecount,offset2_loop //init LCC
d.pfadd.ss AIo,BRo,FRe  // AIo+BRo  FRo    FI     FR
nop

offset2_loop::
d.fnop
fld.q 16(FEtch)++,ARe //fetch AR,AI,ARo,AIo
d.fnop
fld.q 16(FEtch)++,BRe
//                      // A1...... A2.... A3.... Write:
d.pfadd.ss ARe,BRe,FIe  // AR+BRe   FIo    FRo    FI
nop
d.pfadd.ss AIe,BIe,FRo  // AI+BIe   ER     FIo    FRo
nop
d.pfadd.ss ARo,BIo,FIo  // ARo+BIo  EI     ER     FIo
fst.q ERe,16(STore)++ //update ER,EI,ERo,EIo
d.pfsub.ss AIo,BRo,ERe  // AIo-BRo  ERo    EI     ER
nop
d.pfsub.ss ARe,BRe,EIe  // AR-BRe   EIo    ERo    EI
nop
d.pfsub.ss AIe,BIe,ERo  // AI-BIe   FR     EIo    ERo
fst.q FRe,16(STore)++
d.pfsub.ss ARo,BIo,EIo  // ARo-BIo  FI     FR     EIo
bla decrem,somecount,offset2_loop
d.pfadd.ss AIo,BRo,FRe  // AIo+BRo  FRo    FI     FR
nop

endit::
// restore regs
fiadd.ss f0,f0,f0 //exit DIM
fld.q 0(sp),f12
fiadd.ss f0,f0,f0 //last DIM pair
fld.q 16(sp),f8
adds 32,sp,sp
bri r1
nop
//
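The trivial-twiddle cases used by the offset_1 and offset_2 loops above can be sketched in C. This is only an illustration of the arithmetic stated in the listing comments (E = A + B*W, F = A - B*W with W = (1,0) or W = (0,-1)); the type and function names are assumptions and do not appear in the original code.

```c
#include <assert.h>

/* Single-precision complex value, stored as (real, imag) like the FFT data. */
typedef struct { float re, im; } cpx;

/* Which trivial twiddle a butterfly uses (names are illustrative only). */
typedef enum { TW_ONE, TW_MINUS_J } trivial_twiddle;

/* Radix-2 butterfly with a trivial twiddle: E = A + B*W, F = A - B*W.
   W = (1,0)  gives B*W = (BR, BI).
   W = (0,-1) gives B*W = (BI, -BR), so no multiplies are needed. */
static void trivial_butterfly(cpx a, cpx b, trivial_twiddle w, cpx *e, cpx *f)
{
    cpx bw;
    assert(e != 0 && f != 0);
    if (w == TW_ONE) {
        bw = b;
    } else {
        bw.re = b.im;
        bw.im = -b.re;
    }
    e->re = a.re + bw.re;  e->im = a.im + bw.im;
    f->re = a.re - bw.re;  f->im = a.im - bw.im;
}
```

With W = (0,-1) this reproduces ERodd = AR+BI, EIodd = AI-BR, FRodd = AR-BI and FIodd = AI+BR, which is why the offset_2 loop needs only adder-pipeline operations.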
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c . 

C File: dirr.f
C FFT - Decimation in Freq, radix-2, inplace, 1-dimen,
C REAL input
C Intel is not responsible for use nor misuse of this code.
C 8/14/89
C Inputs:
C A= REAL array of input, up to 1024 pts, single-prec float
C M= log of number of pts
C  = (Number of stages of FFT)
C N = number of points, ie, N= 2**M = number of pts
C W= complex array of twiddle factors, length N/2.
C REV= 0 if bitreversed output ok. 1=must re-order output
C (REV will be ignored, and output will be properly ordered. Bit
C reversal WILL be done.)
C
C Outputs:
C A= complex fft of input A, but only the positive frequency half.
C Length = N/2+1 complex numbers. A(0:n/2)
C
      subroutine dirr(a,m,N,W,REV)
      integer m,N,i,j,k,REV,wlimit
      integer offset,stage,groups,wincr,powers2(0:10)
      real a(N)
      complex w(N/2),temp
      data powers2 /1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024/
C Powers2 to avoid calls to POW, DIV
C Twiddle factor array w(k) has (cos, -sin) of 2pi*k/N
CC Assume the caller provides w(k) constants ALREADY initialized
C
C Pre-touch data, for 8kByte fft: (2048 points real)
      IF (N .gt. 1025) THEN
        call fetch(a,%VAL(n/2))
      ENDIF
C
      wlimit = 8*((N/2) - 1)
C "DO 20" stage-loop: doing Complex FFT on length N/2 array. Twiddles are
C for a length N array, so wincr gets scaled by 2.
      DO 20 stage = 1,m-1
        groups = powers2(stage-1)
C groups=number of times the twiddle factors are used, ie, the number of
C smaller DFTs the stage is split into.
C offset gets N/4, N/8, N/16, ...
        offset = powers2(m-1-stage)
        wincr = groups * 2
        call difstep(a,w,groups,offset,wincr,wlimit)
   20 CONTINUE
      call bitrev(a,%VAL(M-1),n/2)
      call realfix(a,w,%VAL(n))
      RETURN
      END
C 
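The per-stage parameters that the "DO 20" loop of dirr passes to the butterfly stage can be tabulated with a short C sketch. It mirrors only the arithmetic visible in the Fortran (groups, offset, wincr, wlimit); the struct and function names are assumptions made for illustration, and shifts stand in for the powers2() table.

```c
#include <assert.h>

/* Per-stage parameters handed to the butterfly-stage routine. */
typedef struct { int groups, offset, wincr; } stage_params;

/* Parameters for stage `stage` (1..m-1) of the length-N/2 complex FFT,
   where N = 2**m real points. */
static stage_params dirr_stage(int m, int stage)
{
    stage_params p;
    assert(m >= 2 && stage >= 1 && stage <= m - 1);
    p.groups = 1 << (stage - 1);      /* powers2(stage-1)              */
    p.offset = 1 << (m - 1 - stage);  /* powers2(m-1-stage)            */
    p.wincr  = p.groups * 2;          /* twiddle table is for length N */
    return p;
}

/* Value of wlimit as computed in dirr: 8*((N/2) - 1). */
static int dirr_wlimit(int n)
{
    return 8 * (n / 2 - 1);
}
```

For N = 1024 (m = 10) the first stage gets groups = 1, offset = 256 (N/4) and wincr = 2, and the last stage (stage 9) gets groups = 256, offset = 1 and wincr = 512.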
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// realfix.ss: This is i860(tm) CPU assembly code to revise data from an 

// N/2 length Complex FFT.
// (assumes the input data fed to Complex FFT was N real values)
//
// INTEL is not responsible for use nor misuse of this code.
//
// 8/14/89
// This 18-cycle-butterfly loop may be sub-optimal.
//
// output = overwrite the data array used for input. Results are
// complex. Re0,Im0,Re1,Im1,...,Re(N/2),Im(N/2).
// NOTE that output array is 1 element longer than input.
//
// Input is H(k), output is F(k)...
// F(k) = .5*( H(k) + Hconj(N/2-k) - j*(H(k) - Hconj(N/2-k))*Wconj(k) )
//
// Algorithm from "Numerical Recipes in C", by Flannery, Press, Teukolsky, and
// Vetterling, Cambridge Univ. Press 1988, p. 417.
// /* The C-version of realfix: */ void realfix_(a,w,n)
// /* Input =
// * a(0:n+1): length n/2+1 complex array. Entries 0:n/2-1 are the complex FFT
// * result, in correct (NON BIT REVERSED) order. Entry n/2 is undefined.
// * w: length n/2 complex array of twiddles. (cos,-sin(2pi*k/n))
// * n: call-by-value, number of REAL input samples
// * Output =
// * a(0:n+1): length n/2+1 complex array.
// * Format is Re0,Im0,Re1,Im1,...,Re(N/2),Im(N/2).
// * NOTE: To generate entire N-length complex output spectrum, you can copy
// * conjugate of element (i) to element (N-i).
// */
//float a[], w[]; int n; { int aptr,bptr,wptr; float half=0.5,
// AR,AI,BR,BI, /* input values for A,B*/
// PR,PI,SR,SI,DR,DI, /*temporary differences, sums, products*/
// K,L,M,N, /*temporary products */
// ER,EI,ERD,EID,
// FR,FI,FRD,FID,
// WR,WI;
// /*We do first and last elements as special case (Imag=0, W=(1,0))*/
// AR = a[0]; AI = a[1];
// a[0] = AR + AI; a[1] = 0;
// a[n] = AR - AI; a[n+1] = 0;






//for(aptr=2, bptr=(n-2), wptr=2; aptr < n/2; aptr +=2, bptr -=2, wptr +=2)
//{WR = w[wptr]; WI = w[wptr+1];
// AR = a[aptr]; AI = a[aptr+1];
// BR = a[bptr]; BI = a[bptr+1];
// /* aptr=2,4,6,...,14; bptr=30,28,26,...,18 (if n=32) */
// /* Note that there is no need to revise the value at the middle of the
// list, as it is already correct. ( .5*(H(n/4)+Hconj(n/4)) ) */
// SI = (AI + BI);
// DR = (BR - AR);
// K = WR*SI; L = WI*DR; PR = K-L;
// M = WR*DR; N = WI*SI; PI = M+N;
// SR = (AR + BR);
// DI = (AI - BI);
// ERD = SR+PR; ER = half*ERD;
// a[aptr] = ER;
// EID = DI+PI; EI = half*EID;
// a[aptr+1] = EI;
// FRD = SR-PR; FR = half*FRD;
// a[bptr] = FR;
// FID = PI-DI; FI = half*FID;
// a[bptr+1] = FI; } /*end of for-loop */ }
//************** End of C-code for realfix. ***********************

.text 

.align .quad 

// 

define(astart,r16) //input data base address
define(wptr,r17) // pointer to W table. Because w-contents depend on N,
// we will assume the caller has initialized w() array.
define(N,r18) //
define(aptr,r20) //pointer to 1st component of butterfly (load)
define(bptr,r21) //pointer to 2nd component of bfly (load); DOWNCOUNTER
define(decrem,r24) //bla decrement
define(count,r25) // bla counter
define(WR,f18) //W (twiddle factor), real part
define(WI,f19) // " " , imag
define(AR,f12) //element A, real component
define(AI,f13) // " " , imag
define(ARo,f14) // extra A value, for prefetch (o="odd")
define(AIo,f15)
define(BR,f16) //element B, real component
define(BI,f17)
define(ER,f20) //Result of butterfly which overwrites AR
define(EI,f21) // " " " " AI
define(half,f22) //constant 0.5
define(FR,f24) //Result of butterfly which overwrites BR
define(FI,f25)
define(PR,f26)
define(PI,f27)
define(DR,f28)
define(DI,f29)
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define (SR, f 30) //Sum of A+B, real part 
define(SI,f31) // " " , imag "
.data
.align .double
halfloc:: .float 0.5
//
.text
.align .quad
_realfix_::
fst.q f12,-16(sp)++ //save "local" regs
adds -4,r0,decrem //bla decrement
//
// We do not bother to initialize FP pipes to zero here, as we assume
// this routine is called after another, "safe", pipelined FP routine.
pfld.l halfloc,f0
pfld.d 8(wptr)++,f0 //skip W(0) intentionally. Is a trivial (1,0) value
// init pointers:
adds 0,astart,aptr
pfld.d 8(wptr)++,f0
shl 2,N,bptr //bptr=total # bytes of input data
pfld.d 8(wptr)++,half //0.5 into an fpr
adds bptr,astart,bptr // bptr points to a(N)
// here fetch first set of A,B,W before bla-loop
pfld.d 8(wptr)++,WR
fld.d (aptr),AR //for 1st and last elements
adds -8,N,count // bla counter (predecrement by 2 butterflies worth)
// Do n/4 butterflies: (computing only N/2 elements of complex output, because
// the second N/2 are just complex conjugates of the 1st N/2)
// Definitions for pipe diagram:
// WR = cos(), WI=-sin().
// DR = BR - AR; (difference of Real components of A,B)
// DI = AI - BI; (difference of Imag components)
// SR,SI = sum of A,B
// PR = K - L; where K= WR*SI, L=WI*DR
// PI = M + N; where M= WR*DR, N=WI*SI
// (ER,EI)=complex result to overwrite A.
// (FR,FI)=" " " " B.

first_fly:: //fill pipe.
// For 0th butterfly:
// AR = a[0]; AI = a[1];
// a[0] = AR + AI; a[1] = 0;
// a[n] = AR - AI; a[n+1] = 0;
// KR...KI...M1...M2...M3 T A1...A2...A3...Write
r2pt.ss f0,f0,f0 // 0
mrmlp2.ss AR,AI,f0 // - ER0 - -
mrmls2.ss AR,AI,f0 // FR ER -
fld.d 8(aptr)++,AR
fld.d -8(bptr)++,BR
d.pfadd.ss f0,f0,f0 // FR ER
d.pfadd.ss f0,f0,ER // FR ER0
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d.ralp2.ss AI ,BI ,FR // 


// - - SI1 - - FR0
nop
d.mrmls2.ss BR,AR,EI // - - - DR1 SI1 - EI0
fst.d ER,-8(aptr)
d.mr2pt.ss WR,f0,FI // WR - - - - - DR1 SI1 FI0
fst.d FR,8(bptr)
d.ralp2.ss BR,AR,SI // K1 - - - SR1 - DR1 SI1
andh 0x8000,count,r0 //check for negative
d.ml2tpm.ss WI,DR,DR // L1 K1 - - - SR1 - DR1
bnc endfix
d.r2pt.ss half,DR,f0 //half M1 L1 K1 - - - SR1 -
nop
d.ml2ttpa.ss WI,SI,SR // N1 M1 L1 K1 - - - SR1
nop
d.i2st.ss f0,f0,f0 // f0 - N1 M1 K1 PR1 - - -
nop
// KR...KI...M1...M2...M3 T A1...A2...A3...Write
d.ratls2.ss AI,BI,f0 // - - N1 M1 DI1 PR1 - -
nop
d.i2pt.ss f0,f0,f0 // f0 - - - M1 PI1 DI1 PR1 -
fld.d 8(aptr)++,AR
d.r2apl.ss SR,f0,PR // - - - - ERD PI1 DI1 PR1
fld.d -8(bptr)++,BR
d.rals2.ss SR,PR,DI // - - - - FRD ERD PI1 DI1
pfld.d 8(wptr)++,WR
d.r2apl.ss DI,f0,PI // - - - - EID FRD ERD PI1
nop
d.rals2.ss PI,DI,f0 // ER1 - - - FID EID FRD -
nop
d.ralp2.ss f0,f0,f0 // FR1 ER1 - - - FID EID -
nop
d.rals2.ss f0,f0,f0 // EI1 FR1 ER1 - - - FID -
bla decrem,count,fix_loop
d.pfadd.ss f0,f0,FI // EI1 FR1 ER1 - - - - FID
nop
//
// Each butterfly = 1 complx multiply, 3 complx add, 1 real multiply
// = 8 multiply, 10 add/subtract
// 3 8-byte fetches (A, B, W)
// 2 8-byte stores (E, F)
//
// approx. 18 cycles per butterfly
//
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fix_loop:: // KR. 


//     KI...M1...M2...M3 T A1...A2...A3...Write
d.mr2pt.ss f0,FI,ER // FI1 EI1 FR1 - - - - ER1
nop
d.mrmlp2.ss AI,BI,FR // - FI1 EI1 - SI2 - - FR1
nop
d.mrmls2.ss BR,AR,EI // - - FI1 - DR2 SI2 - EI1
fst.d ER,-8(aptr)
d.mr2pt.ss WR,f0,FI // WR - - - - - DR2 SI2 FI1
fst.d FR,8(bptr)
d.ralp2.ss BR,AR,SI // K2 - - - SR2 - DR2 SI2
andh 0x8000,count,r0 //check for negative
d.ml2tpm.ss WI,DR,DR // L2 K2 - - - SR2 - DR2
bnc endfix
d.r2pt.ss half,DR,f0 //half M2 L2 K2 - - - SR2 -
nop
d.ml2ttpa.ss WI,SI,SR // N2 M2 L2 K2 - - - SR2
nop
d.i2st.ss f0,f0,f0 // f0 - N2 M2 K2 PR2 - - -
nop
// KR...KI...M1...M2...M3 T A1...A2...A3...Write
d.ratls2.ss AI,BI,f0 // - - N2 M2 DI2 PR2 - -
nop
d.i2pt.ss f0,f0,f0 // f0 - - - M2 PI2 DI2 PR2 -
fld.d 8(aptr)++,AR
d.r2apl.ss SR,f0,PR // - - - - ERD PI2 DI2 PR2
fld.d -8(bptr)++,BR
d.rals2.ss SR,PR,DI // - - - - FRD ERD PI2 DI2
pfld.d 8(wptr)++,WR
d.r2apl.ss DI,f0,PI // - - - - EID FRD ERD PI2
nop
d.rals2.ss PI,DI,f0 // ER2 - - - FID EID FRD -
nop
d.ralp2.ss f0,f0,f0 // FR2 ER2 - - - FID EID -
nop
d.rals2.ss f0,f0,f0 // EI2 FR2 ER2 - - - FID -
bla decrem,count,fix_loop
d.pfadd.ss f0,f0,FI // EI2 FR2 ER2 - - - - FID
nop
//
endfix::
// restore regs
fiadd.ss f0,f0,f0 //exit DIM
fld.q 0(sp),f12
fiadd.ss f0,f0,f0 //last DIM pair
adds 16,sp,sp
bri r1
nop
//
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PROGRAM FFTTEST 
c file = real.f
C
C 1-D FFT TEST PROGRAM
C
C 8/14/89
C Intel assumes no responsibility for use or misuse of this code.
C
      PARAMETER (IREV=1)
      character*8 really
      PARAMETER (REALLY='real')
c     PARAMETER (REALLY='complex')
      PARAMETER (TIMEIT=0, CACHETIME=0)
c REALLY='real' means real-only input, otherwise assume complex input
      DATA IT/200000/
c     PARAMETER (N=2048,M=11)
      PARAMETER (N=1024,M=10)
c     PARAMETER (N=512,M= 9)
c     PARAMETER (N=256,M= 8)
c     PARAMETER (N=128,M= 7)
c     PARAMETER (N=64,M= 6)
c     PARAMETER (N=32,M= 5)
c     PARAMETER (N=16,M= 4)
      PARAMETER (PI=3.1415926536)
      COMPLEX X2(N),X(N),X3(N),W(N/2)
      Real ASQR(N),ASQR2(N),XR(N+2),XR1(N+2),XR2(N+2),XR3(N+2)
      complex wtemp
      real rtemp
C
      PRINT *,' FFT test program ....'
      print *,' ========================'
      IF (IREV .eq. 0) THEN
        print *,'NOT counting time for bit-reversal.'
        print *,'DO NOT expect matching answers, without bit-rev'
      ELSE
        print *,'Time for bit-reversal included.'
      ENDIF
      print *,'Time for cache writeback and fills...'
      IF (CACHETIME .eq. 0) THEN
        print *,' NOT included, if iterating.'
      ELSE
        print *,' ... included.'
      ENDIF



print *, 
print * , 
print *, 
print *, 
print *, 
print *, 



If iterating... Number of Iterations =',IT 



Number of Points 
(', REALLY,' data) » 
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C Init twiddle factor array w(k) with (cos, -sin) of 2pi*k/N 

      rtemp = 2.0*pi/N
      wtemp = CMPLX(cos(rtemp), -sin(rtemp))
      w(1) = (1.0, 0.0)
      DO 200 k = 2, N/2
  200 w(k) = wtemp * w(k-1)
cc    print *,' W (twiddle) initialization completed'
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C INITIALIZE input data
C
      DO 100 I = 1, N
c constant:
c       Treal = 1.0
c       Timag = 0.0
c squarewave:
cc      IF (I .lt. N/2) THEN
cc        Treal = 1.0
cc        Timag = 0.5
cc      ELSE
cc        Treal = 0.0
cc        Timag = 0.0
cc      ENDIF
C ramp function:
        Treal = I-1.0
        Timag = Treal + 0.5
        IF (REALLY .ne. 'real') THEN
          X(I) = CMPLX(Treal, Timag)
          X2(I) = CMPLX(Treal, Timag)
          X3(I) = CMPLX(Treal, Timag)
        ELSE
          X(I) = CMPLX(Treal, 0.0)
          X2(I) = CMPLX(Treal, 0.0)
          XR(I) = Treal
          XR1(I) = Treal
          XR2(I) = Treal
          XR3(I) = Treal
        ENDIF
  100 CONTINUE
C
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
      CALL fft(X2, M, N)
cc Subroutine fft is Decimation-In-Time, Fortran version.
      CALL dirr(XR,M,N,W,1)
c (Assuming dirr produces inplace result, items 0:N/2 complex results)
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ccccccccccccccccccccccccccccccccccccccc 

      IF (IREV .ne. 0) THEN
        IF (TIMEIT .eq. 0) THEN
          call vcompare(XR,X2,N/2+2)
          call cmags(XR,N/2+1,ASQR)
c cmags to take squared magnitude of complex values in X
          call cmags(X2,N,ASQR2)
c
C print non-zero results:
          J=0
          DO 700 I = 1, N/2+1
          IF ((ASQR(I) .GT. 1.0) .OR. (ASQR2(I) .GT. 1.0)) THEN
            WRITE (6,22) (I-1), ASQR(I), ASQR2(I)
   22 FORMAT (' I-1=',I4,' ASQR(I)= ',F14.2,' ASQR2(I)= ',F14.2//)
            J = J+1
            IF (J .GT. 32) GOTO 725
          ENDIF
  700     CONTINUE
  725     CALL TIME
        ENDIF
      ENDIF

      IF (TIMEIT .ne. 0) THEN
ccccccccccccccccccccccccccccccccccccccc
cc- Timing loop follows:
        print *,' Start Ass.FFT'
        IF (CACHETIME .eq. 0) THEN
          DO 500 I=1, IT, 4
C Reuse same array, so cache fill and writeback time NOT included.
          CALL dirr(XR, M, N,W,IREV)
          CALL dirr(XR, M, N,W,IREV)
          CALL dirr(XR, M, N,W,IREV)
  500     CALL dirr(XR, M, N,W,IREV)
        ELSE
          DO 504 I=1, IT, 4
C Alternating between XR, XR1, XR2, XR3 should provide cache misses.
          CALL dirr(XR, M, N,W,IREV)
          CALL dirr(XR1, M, N,W,IREV)
          CALL dirr(XR2, M, N,W,IREV)
  504     CALL dirr(XR3, M, N,W,IREV)
        ENDIF
        print *,' END Ass. FFT'
ccccccccccccccccccccccccccccccccccccccc
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ENDIF 

STOP 
END 

c
      subroutine vcompare (res,exp,n)
c VCOMPARE compares 2 vectors, prints out 1st few miscompares
c
      integer n, errcnt
      real res(n), exp(n)
      write(6,12)
   12 format(' *** VCOMPARE: vector comparison beginning ***')
      data errcnt/0/
      do 30 i = 1,n
        if (AINT(res(i)) .ne. AINT(exp(i))) then
c (print out error, exit if a lot already)
  120     print *,'*** Error in compares ***'
          write(6,121) i
  121     format(' Item number = ',I6)
          write(6,124) res(i), exp(i)
  124     format(' Res = ',F14.2,' Expected = ',F14.2)
          errcnt = errcnt + 1
          if (errcnt .gt. 19) then
            return
          end if
        end if
   30 continue
      if (errcnt .eq. 0) then
  190   print *,' *** vector compares SUCCESSFUL ***'
      end if
   99 return
      end
c
c c 
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c 

C file: fft.f
C FFT routine from Rabiner & Gold, 1975, who copied it
C from Cooley, Lewis, Welch
C 6/02/89
C
C Decimation in Time, radix-2, inplace, 1-dimen
C Inputs:
C A= complex array of input, up to 1024 pts, single-prec float
C (maybe more than 1024, uncertain what limit is)
C M= log of number of pts
C  = (Number of stages of FFT)
C N = number of points, ie, N= 2**M = number of pts
C
C Outputs:
C A= complex fft of input A, in NON-bit-reversed order.
C
C w (twiddle factor) calculated by recursion. Supposedly takes 15% more
C operations than keeping entire twiddle array as constants pre-allocated.
C
      subroutine fft(a,m,n)
      integer m,n,i,j,k,ndiv2,powers2(0:10)
      integer iplus,offset,stage,index1,groups
      complex a(n),wtemp(2),w(11),temp
C Init twiddle factor array w() with (cos, -sin) of pi, pi/2, pi/4, ...
      data w(1) /(-1.0,0.0)/
      data w(2) /(0.0,-1.0)/
      data w(3) /(0.7071068,-0.7071068)/
      data w(4) /(0.9238795,-0.3826834)/
      data w(5) /(0.9807853,-0.1950903)/
      data w(6) /(0.9951847,-0.0980171)/
      data w(7) /(0.9987955,-0.0490677)/
      data w(8) /(0.9996988,-0.0245412)/
      data w(9) /(0.9999247,-0.0122715)/
      data w(10) /(0.9999812,-0.0061359)/
      data w(11) /(0.9999953,-0.003068)/
      data powers2 /1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024/
C Powers2 to avoid calls to POW, DIV
C Setup for bit-reversal loop:
      ndiv2 = n / 2
      J = 1
C
C "DO 7" loop to in-place-bit-reverse-shuffle input
      DO 7 i = 1, n-1
        IF (i .lt. j) THEN
          temp = a(j)
          a(j) = a(i)
          a(i) = temp
        ENDIF
        k = ndiv2
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C "While (j .gt. k) n /^decrease j by 2**something */ 


    6   IF (J .gt. k) THEN
          J = J-k
          k = k / 2
          GOTO 6
        ENDIF
C Add next lower power of 2 to j
    7 j = j+k
C
C Special case for stage 1: no complex multiplies, simple add
C (Performance enhancement)
      groups = 2
      offset = 1
      index1 = 1
C i-loop iterates N/2 times for 1st stage (and would do twice N/4 x for 2nd)
CVD$ NODEPCHK
      DO 8 i = 1,n,2
        iplus = i + 1
        temp = a(iplus)
        a(iplus) = a(i) - temp
    8 a(i) = a(i) + temp
c
C Special case for stage 2: no complex multiplies, simple add
C (Performance enhancement)
      groups = 4
      offset = 2
      index1 = 1
C i-loop iterates N/4 times for 2nd stage
C 1st call to i-loop, in stage2: index1=1, wtemp(1)=(1,0)
CVD$ NODEPCHK
      DO 90 i = 1,n,4
        iplus = i + 2
        temp = a(iplus)
        a(iplus) = a(i) - temp
   90 a(i) = a(i) + temp
      index1 = 2
CVD$ NODEPCHK
CVD$ NOVECTOR
      DO 92 i = 2,n,4
        iplus = i + 2
        temp = CMPLX(AIMAG(a(iplus)),-REAL(a(iplus)))
        a(iplus) = a(i) - temp
   92 a(i) = a(i) + temp
CVD$ VECTOR
C
C "DO 20" stage-loop executed once for each of the (m) stages of FFT
C (Except 1st and 2nd stage)
C offset gets 4,8,16,32,64,128,256...
      DO 20 stage = 3,m
        groups = powers2(stage)
        offset = groups/2
        wtemp(1) = (1.0, 0.0)
C One twiddle seed (W) calc per stage.
C We pre-allocated w(12) -array with those values, avoid cos/sin calls
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c 

      DO 20 index1 = 1, offset
C "DO 10" i-loop does each butterfly of each stage, with varying twiddles
C i-loop iterates N/2 times for 1st stage, N/4 x for 2nd, N/8 x for 3rd
C stage, N/16 x for 4th stage,... 1 time for last stage.
CVD$ NODEPCHK
CVD$ ALTCODE
        DO 10 i = index1, n, groups
          iplus = i + offset
          temp = a(iplus) * wtemp(1)
          a(iplus) = a(i) - temp
   10   a(i) = a(i) + temp
   20 wtemp(1) = wtemp(1) * w(stage)
      RETURN
      END
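The in-place bit-reversal shuffle done by the "DO 7" loop of fft.f can be sketched in C with 0-based indices. This is an illustrative translation only: the function name and the split real/imaginary arrays are assumptions, not part of the original package (whose assembly bitrev routine is not shown here).

```c
#include <assert.h>

/* In-place bit-reversal shuffle of n complex items (n a power of two),
   following the "DO 7" loop of fft.f but with 0-based indices. */
static void bit_reverse_shuffle(float *re, float *im, int n)
{
    int i, j = 0, k;
    assert(n > 0 && (n & (n - 1)) == 0);
    for (i = 0; i < n - 1; i++) {
        if (i < j) {                 /* swap each pair only once */
            float tr = re[j], ti = im[j];
            re[j] = re[i];  im[j] = im[i];
            re[i] = tr;     im[i] = ti;
        }
        k = n / 2;
        while (k <= j) {             /* strip leading 1 bits of j */
            j -= k;
            k /= 2;
        }
        j += k;                      /* add next lower power of 2 */
    }
}
```

For n = 8 the shuffle exchanges items 1 and 4 and items 3 and 6 and leaves the rest in place, giving the order 0,4,2,6,1,5,3,7.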

C
      subroutine cmags(a,n,asqr)
C Complex magnitude squared.
C Inputs:
C A= complex array of input, single-prec float
C N = number of input points (and output points)
C Output:
C asqr = real squared magnitude (R*R + I*I), N elements, single-prec float
      integer n,i
      real asqr(n)
      complex a(n)
      DO 100 i = 1, n
        asqr(i) = (REAL(a(i))*REAL(a(i))) + (AIMAG(a(i))*AIMAG(a(i)))
  100 CONTINUE
      RETURN
      END
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## makefile for i860(tm) CPU FFTs (for Unix V/386 programming environment) 

## 8/7/89
##
GH=/usr/i860/bin
GHL=/usr/i860/lib
CC=$(GH)/c860
FC=$(GH)/f860
CFLAGS= -OLNI -X393 -X405 -X188 -X370
FFLAGS= -OLNI -X370 -X393 -X71 -X422
## -X71 uses single-precision math routines
FLFLAGS= -Mx map -e start
LFLAGS= -Mx map -e _main
CLIB=$(GHL)/libc.a
MLIBPSR=$(GHL)/860mtlib.a
MLIB=$(GHL)/libm.a
FLIB=$(GHL)/libf.a
ASM=$(GH)/as860
FLINK=$(GH)/ld860 $(FLFLAGS)
RT=$(GHL)/s5lib.a
LIBS= $(FLIB) $(MLIBPSR) $(MLIB) $(CLIB) $(RT)
LIBCC= $(MLIB) $(CLIB) $(RT)
## NOTE: Order of linked files is CRUCIAL, other orders may give errors
.SUFFIXES:
.SUFFIXES: .f .c .s .ss .o .8
.IGNORE:
## .ignore causes make to ignore error codes from compilers
## To test Fortran plus assembler-fft-stage version:
FILE= ffttest.o fft.o diff.o bitrev.o difstep.o start.o time.o
## To test all-Fortran version of fft:
##FILE= ffttest.o fft.o diff.o difstepf.o start.o time.o
## To test REAL-input version of fft:
RFILE= real.o fft.o dirr.o realfix.o difstep.o bitrev.o start.o time.o

.f.o:
	$(FC) $(FFLAGS) $*.f
	$(ASM) -x -o $*.o $*.s

.c.o:
	$(CC) $(CFLAGS) $*.c
	$(ASM) -x -o $*.o $*.s

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• s.o : 








	m4 $*.s > temp2.s
	$(ASM) -x -o $*.o temp2.s

ffttest.8: $(FILE)
	$(FLINK) -o ffttest.8 $(FILE) $(LIBS)

real.8: $(RFILE)
	$(FLINK) -o real.8 $(RFILE) $(LIBS)

clean:
	rm -f *.o *.8

.ss.o:
	m4 $*.ss > temp.s
	$(ASM) -x -o $*.o temp.s
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//start. ss 




// 8/18/89
// Fortran runtime startoff routine
//
.text
.globl start
.globl finish
start::
orh h%_stack+262128+262144,r0,sp
or l%_stack+262128+262144,sp,sp
adds -16,sp,sp
st.l r1,12(sp)
call _main
nop
finish::
call _exit
nop
.file "start.c"
.data
.align .quad
.lcomm _stack,262144+262144
.end
//


/* file: time.c. Purpose: establish a label to use for breakpoints */

long time_(x)
long *x;
{ x = x+4;
  return ((long) x);
}

long timestop_(x)
long *x;
{ x = x+4;
  return ((long) x);
}
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1.0 BACKGROUND 

The Intel 82495 Cache Controller and 82490 Cache RAM form a high-speed
cache subsystem for the Intel486 DX CPU (82495DX/490DX) or the i860 XP
CPU (82495XP/490XP). The reader should be familiar with these chips, as
described in:

1) i860 XP CPU Microprocessor Data Sheet (Intel order #240874)

2) Intel486 DX Microprocessor Data Sheet (Intel order #240440)

3) 82495XP Cache Controller/82490XP Cache RAM Data Sheet (Intel order
   #240956, June 1991)

   or Intel486 DX CPU Microprocessor Cache-Chip Set Data Sheet (Intel
   order #241084, June 1991)

Diagrams of systems containing the 82495 and 82490 appear in Figure 1,
and a more detailed diagram of the CPU/82495/82490 core appears in
Figure 2. (Note: for simplicity, the 82495XP/82490XP and
82495DX/82490DX will be referred to generally as 82495/82490; the XP or
DX should be inferred depending upon the CPU being utilized.) In such
systems, the 82495 controls a cache external to the CPU, and includes
the cache tags. It can interface gluelessly to an Intel486 DX CPU or
i860 XP CPU microprocessor, allowing the processor bus to run at 50 MHz
with zero wait-states, while the memory bus can remain at a lower
frequency. Both writeback and writethrough protocols are supported.
Operations can proceed concurrently on the local CPU bus and the shared
memory bus. All requisites for multiprocessors are included in the
82495, Intel486 DX CPU, and i860 XP CPUs, but the 82495 is also useful
for improving the performance of a uniprocessor system.

The 82490 cache RAM contains 32 kBytes per chip, and is used in groups
of 4, 8, or 16 to implement caches from 128 to 512 kBytes. It supports
two-way associativity, delayed writebacks, burst transfers, and
boundary scan test. The 82490 contains much more than RAM cells: it
includes various buffers, queues, and support for several bus
protocols. It is two-ported, with simultaneous access on both the CPU
side and Memory-Bus side. The cache optionally supports parity using
additional 82490 chips.
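The cache sizes quoted above follow directly from the chip count. As a minimal sketch of that arithmetic (the function and macro names are hypothetical, and the optional parity chips are not counted):

```c
#include <assert.h>

#define KBYTES_PER_82490 32   /* from the text: 32 kBytes per cache RAM chip */

/* Total cache size in kBytes for a group of 82490 chips.
   The text lists groups of 4, 8, or 16; any other count returns 0 here. */
static int cache_kbytes(int chips)
{
    assert(chips > 0);
    if (chips != 4 && chips != 8 && chips != 16)
        return 0;
    return chips * KBYTES_PER_82490;
}
```

Groups of 4, 8, and 16 chips therefore give 128, 256, and 512 kBytes respectively.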

Configuration options allow a variety of memory bus widths (32 to 144
bits), cache line widths (16 to 128 bytes), and asynchronous or
synchronous transfers. The configuration is selected by the polarity of
various pins at reset time.
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3. Heterogeneous Multiprocessor 
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Figure 1 . CPU + 82495 + 82490 Systems 
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Figure 2. CPU + 82495 + 82490 Core 



The Memory Bus Controller (MBC) portion of the sys- 
tem interfaces the 82495 and CPU to the system bus. 
The MBC converts bus status and command lines into 
requests to the 82495, for example, to monitor the prog- 
ress of an ongoing bus transaction from another CPU 
subsystem to ensure consistency with 82495 + 82490 
cache contents. Likewise the MBC adapts 82495 re- 
quests to the bus protocol and arbitrates for ownership 
of the bus. Most CPU requests will not require MBC 
action; only I/O cycles, cache bypass requests, and 
82495 cache misses are forwarded by 82495 to the 
MBC, while external cache hits are handled totally by 
82495 + 82490. 



2.0 WHY A CUSTOM BUS 
INTERFACE? 

Clearly the entire interface to a memory bus (abbreviat- 
ed M-bus) could have been incorporated in the 82495 
and 82490 chips. This approach has been followed by 
some other cache chipsets. 

However, such integration suffers from inflexibility and 
bandwidth limitations. As shown in Figure 3, the per- 
formance and cost targets of the system determine the 
size and complexity of the bus, so if the bus is "hard- 
wired" into the cache controller chip, it will be too 
costly for small systems and too slow for larger sys- 
tems. With the bus interface implemented separately, it 
can be a complex ASIC for a high-bandwidth complex 
system, or a few EPLDs for a PC. The same cache 
controller can improve performance of a variety of bus- 
based CPUs. 

For a desktop PC, a 32-bit simple memory bus is ade- 
quate. For a workstation or small multiprocessor of 
two CPUs, a faster 64-bit bus may be required to give 
adequate bandwidth for graphics frame buffers and
intensive numeric calculations. Bus bandwidth
requirements grow as the MIPS rating of each CPU in a
system grows; for example, a bus adequate for twelve 386
CPUs may be too slow for six Intel486 DX CPUs, as
they process far more data per second.



A large multiprocessor of 6 or more CPUs needs a wide
and fast bus such as Futurebus+, with split-transaction
capability to prevent bus bottlenecks from slowing
every processor. Hierarchies of buses and caches allow
still more CPUs, with reasonable performance increases
as CPUs are added. A Futurebus+ hierarchy maintains
concurrent transactions on each bus, and "bridge" caches
at the junctions of buses echo a transaction from bus to
bus when the bridge detects that it may affect cached
copies on the other bus.

Compatibility with existing buses is often crucial in 
product design, so that new faster components can plug 
into existing machines and I/O devices. The flexible 
82495/82490 bus interface allows compatibility as well 
as extension. 

Thus the 82495 and 82490 will be used in a wide variety
of systems, including standard buses like Futurebus+.
For proprietary buses, the "proprietor" can design an
ASIC or PAL MBC incorporating the required features.



3.0 GUIDELINES 

This document exists to clarify the necessary compo- 
nents and tradeoffs of a Memory Bus Controller. The 
example designs here have not been tested, and signal 
definitions of the i860 XP CPU, Intel486 DX CPU, 
82495, and 82490 chips are subject to change. 

The memory bus controller is not allowed to use (and 
thus add capacitance to) any of the CPU pins used by 
the 82495/82490, except those listed in the 82495 Data 
Sheet [82495/490DS] description of the BLE# pin. 
Only the CPU pins BE7-0#, PWT, PCD, LEN,
CACHE#, BRDY#, PCYC, and CTYP have sufficient
timing margin to tolerate the MBC load.

Figure 3. System Type and Bus Requirements



Shared Bus Interconnect 

When used in a multiprocessor, the 82495 assumes a 
shared-memory, shared-bus environment so that it can 
observe and "snoop" accesses by others which might 
conflict with the memory locations it has cached. In a 
crossbar or other multipath interconnect, shared-bus 
coherency can be emulated for the 82495 or it can be 
used non-coherently. Either a centralized directory or a 
hierarchy of buses and caches can do the emulation. A 
directory would keep a record, for each line of main 
memory, of caches which have the line. When a cache 
first writes to a line of memory, the central directory 
broadcasts an invalidation message to all other caches 
containing that line. [Agarwal88] 
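
The directory scheme described above can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, not a description of any particular interconnect; the class name, method names, and the logging of invalidation messages are all hypothetical:

```python
from collections import defaultdict


class Directory:
    """Hypothetical central directory: tracks which caches hold each line."""

    def __init__(self):
        self.sharers = defaultdict(set)   # line address -> set of cache ids
        self.invalidations = []           # log of (line, cache_id) messages

    def read(self, cache_id, line):
        # A cache that fetches a line is recorded as a holder of that line.
        self.sharers[line].add(cache_id)

    def write(self, cache_id, line):
        # On a write, every other holder is sent an invalidation message.
        for other in sorted(self.sharers[line] - {cache_id}):
            self.invalidations.append((line, other))
        self.sharers[line] = {cache_id}   # the writer is now the only holder


d = Directory()
d.read(0, 0x100)
d.read(1, 0x100)
d.read(2, 0x100)
d.write(1, 0x100)
```

After the write by cache 1, caches 0 and 2 have each been sent one invalidation for line 0x100; a second write by cache 1 would send none, which matches the "first writes" wording in the text.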



4.0 MBC BLOCK DIAGRAM 

Shown in Figure 4 is a high-level block diagram of the 
functions and interfaces involved in the Memory Bus 
Controller. Part of the MBC operates on the high-speed 
clock (CLK) which the CPU and 82495 use. While the 
M-bus could use the 50 MHz CPU CLK, such a fast 
M-bus is hard to design. The part of the MBC which 
interacts with the memory bus protocol runs on an M- 
bus clock (MCLK), if that protocol is clocked. Also 
possible is an unclocked M-bus protocol using the 
82495/82490 in "strobed" mode. The MBC contains 
synchronizers and a few signals which cross between 
the two clock domains. Synchronizers, consisting of 
specially-designed flip-flops, allow a clocked state ma- 
chine to use data which may be transitioning near the 
edge of the clock. Unsynchronized data can cause 
metastability in latches, where their output changes 
slowly and unpredictably. 

Figure 4. Generic Block Diagram of MBC

5.0 DESIGN EXAMPLE: A UNIPROCESSOR MBC

A simple MBC design example is an adapter to allow 
plugging a daughtercard module with an Intel486 DX 
CPU, 82495, and 4 82490s into an Intel486 DX CPU 
microprocessor PGA socket. The memory bus is an 
Intel486 DX CPU-bus, allowing the external cache to 
be a performance enhancing option. It assumes a
"divided synchronous" M-bus clock, where the M-bus
runs at 1/2 the CPU CLK speed. Thus no synchronizers
are needed. The MBC uses both the CPU CLK and the
M-bus MCLK.

This design requires 

• 1 74F377 latch 

• 6 PLDs containing 10 state machines 

• 2 chips for clock generation, not part of the MBC 

Approximately 70 signal pins connect the MBC block 
to the CPU, cache, and memory. Only a uniprocessor is 
supported, although the bus protocol and MBC could 
be enhanced for multiprocessing coherency. Figure 5 
shows a block diagram. Details of the design can be 
found in Appendix B. 



6.0 DESIGN EXAMPLE: A 
MULTIPROCESSOR MBC 

An i860 XP CPU multiprocessor-capable MBC (Figure 
6) using an M-bus similar to the i860 XP CPU bus is 
proposed. For clocking, it uses an MCLK of 33 MHz, 
totally asynchronous to the 50 MHz CPU CLK. It 
could therefore be upgraded to faster CPU CLK rates 
in the future without changing the design or M-bus. 

The design requires:

• 2 74F377 octal latches (for BE7-0#, etc.)

• 2 74AS4374 dual-rank-synchronizer octal registers

• 16 PLDs

• 2 GA1110 clock drivers for clock distribution

These components could be integrated into a single
ASIC chip, as about 120 signals connect to the MBC.
The MBC can be used for a uniprocessor or
multiprocessor i860 XP CPU design. Details can be found in
Appendix C.

Figure 5. Block Diagram of Uniprocessor MBC

Figure 6. Block Diagram of Multiprocessor MBC

7.0 MBC FUNCTIONS

Table 1 shows the responsibilities of the Memory Bus
Controller for uniprocessors and multiprocessors (MP).
The multiprocessor features exist mainly to prevent bus
over-utilization. However, some of the jobs common to
both are more complex in MP, for example, arbitration
and snooping. The pin lists in the table are not
exhaustive.

Table 1. Functions of the Memory Bus Controller

MBC Functions for Uni and Multiprocessors          Pins

   1. RESET and Configuration                      RESET, HOLD, CAHOLD
   2. FLUSH# and SYNC#                             CAHOLD, FSIOUT#, FLUSH#, SYNC#
** 3. Bus Error Detection, Retry                   PCHK#, BERR
   4. CPU transfer tracking (burst count)          CLEN1:0, RDYSRC, BRDY#
   5. M-bus transfer tracking (burst count)        CRDY#, MBRDY#
      (including writeback, allocation)
   6. Synchronization between clock domains        BGT#, CADS#, MBRDY#
** 7. Memory-bus pipelining                        BGT#, CNA#, MEOC#, CRDY#
** 8. MBC-to-82495 pipelining                      CNA#, MALE
   9. Memory Bus Arbitration                       BGT#
  10. Cacheability decode                          KWEND#, MKEN#
**11. Redrive bus signals for BTL or ECL levels
      or heavy capacitive loads
**12. Packing (convert 32-bit M-bus for 64-bit     MBRDY#
      82490 size, or 8-bit ROM)
**13. Bus messages (interrupts, flushes)           INT(R), FLUSH#
**14. Boundary scan and selftest                   TCK, TMS, SLFTST#
**15. Performance monitoring (M-bus utilization,   CW/R#, CADS#
      read vs. write)
  16. Snoop handshake (snooping DMA or other CPU)  SNPSTB#, SNPCLK, SNPCYC#
  17. Snoop writebacks                             MHITM#, SNPADS#

Additional MBC Functions for Multiprocessors       Pins

   M1. Snoop window (as master)                    SWEND#, MWB/WT#
   M2. Backoff 82495 when request was to M-line    MAOE#
       in another 82495
** M3. Snoop filtering (via SMLN#)                 SMLN#
** M4. Cache-to-cache transfers (CTCT)             DRCTM#, MBAOE#
** M5. Read-For-Ownership (RFO)                    PALLC#, DRCTM#, MFRZ#
** M6. Split transactions (requires duplicate      CWAY
       tag array)
** M7. Memory cycle abort (after MHITM#)           MHITM#
   M8. LOCK# protection                            KLOCK#, CAHOLD, SNPCYC#
** M9. LOCK# de-assertion (for back-to-back        KLOCK#
       Intel486 DX CPU locks)
**M10. CPLOCK# (Intel486 DX CPU only)              CPLOCK#
**M11. Snoop during LOCK#                          KLOCK#
**M12. Multiprocessor Interrupts (for              INT, NMI (BERR)
       Message-Based Interrupts or TLB shootdown)

** = optional and implementation dependent

MBC Functions for Uni and Multiprocessors

Reset and configuration control includes strapping of
the following pins to resistors at Vcc or Ground, or
"temporary strapping" of multifunction pins whose
state during the last 16 clocks before the falling edge of
RESET determines 82495, 82490, or CPU configuration.
The circuit feeding RESET to these chips should
keep it active at least 16 CLK periods. "Temporary
strapping" means including RESET or A RESET in
the logic equation for the pin. The multifunction pins
are indicated with brackets [ ] below:

i860 XP CPU pins:

PEN#, FLINE#, HOLD

Intel486 DX CPU pins:

RDY#, BOFF#, BS8#, BS16#, HOLD, FLUSH#

82495 pins:

CFG3, CFG2[KWEND#], CFG1[SWEND#],
CFG0[CNA#], CPUTYP[HITM#],
FPFLDEN[FPFLD#], NCPFLD#[FLUSH#],
SNPMD[SNPCLK], C490LDRV[BGT#],
MEMLDRV[SYNC#], SLFTST#[CRDY#], TEST,
HIGHZ#[MBALE], CACHE#

(NOTE: the FPFLDEN pin is defined for Intel486 DX CPU as
PLOCKEN[CPLOCK#]. The 82495XP does not use
CFG3 for configuration in i860 XP CPU systems.)

82490 pins:

MTR4/TR8#[MSEL#], MX4/MX8#[MZBT#],
MSTBM[MCLK], MEMLDRV[MFRZ#], PAR#,
MOCLK, (BOFF#, HITM#)

Intel486 DX CPU: The "unused" Intel486 DX CPU 
inputs (RDY#, BS8#, BS16#, BOFF#) with 82495 
should be connected as described in the Intel486 DX 
CPU Chipset EDS. 

The Intel486 DX CPU FLUSH# input should be tied
high, unless the system requires FLUSH messages from
the M-bus to be interpreted. In that case the MBC must
assert the FLUSH# inputs to both the Intel486 DX CPU
and the 82495, because the 82495 does not do
back-invalidates to the Intel486 DX CPU for FLUSH#.
During RESET, the Intel486 DX CPU FLUSH# input must
be kept high to avoid putting the CPU in
tristate-output-test mode (Intel486 DX CPU Data Sheet
Section 8.4).

i860 XP CPU: The i860 XP CPU input PEN# (Parity
trap ENable) must be strapped high unless the memory
data bus feeding the 82490s always contains good parity
and the i860 XP CPU system uses 2 82490s in parity
mode; in the latter case, strap PEN# low. HOLD
should be strapped low and FLINE# strapped high, as
those features cannot be used with the 82495.

82495: The multiplexed 82495 pin FPFLDEN[FPFLD#]
becomes an output after RESET, so the
PAL or ASIC which creates FPFLDEN must float it
as soon as RESET = 0. The same multiplexing applies
to Intel486 DX CPU mode, where the pin is named
PLOCKEN[CPLOCK#]. Likewise, the multiplexed
input FLUSH#[NCPFLD#] should be driven high
in the same clock that RESET falls, to prevent an
unnecessary 82495 cache flush. In Intel486 DX CPU
systems, the 82495 input CACHE# must be tied low and
HITM#[CPUTYP] must be tied low, as it signals
CPUTYPE to the 82495.

82490: The 82490DX inputs HITM# and BOFF# 
must be tied high in an Intel486 DX CPU system, as 
they exist to support the i860 XP CPU writeback 
cache. With an i860 XP CPU, the 82490XP input 
BOFF# comes from 82495XP but HITM# from i860 
XP CPU feeds 82495XP and 82490XP. 

The 82490 input MOCLK must also be tied low or to a
delayed version of MCLK, if clocked-M-bus mode is
used. This is because the 82490 senses the state of
MOCLK after RESET ends: if MOCLK stays low,
the 82490 uses MCLK to drive MDATA. If MOCLK
toggles after RESET, the 82490 will use MOCLK to
switch output data. Using a delay line external to the
82490 to generate MOCLK from MCLK gives the
design a longer hold time at other receivers of MDATA
in the system. For a clocked M-bus (not synchronous
to CLK), the undelayed MCLK should be connected
to the 82495's SNPCLK input and should be
toggling during RESET to tell the 82495 to snoop in
clocked mode.

During RESET, the 82495 and 82490 will float the
bidirectional lines they share with the CPU, such as
CDATA and A31-A3. Thus driver contention is avoided.
The RESET input should be synchronous to CLK
and deasserted to the 82495, 82490s, and CPU at the
same time, to assure that the configuration controls get
properly passed between them.

For Intel486 DX CPU resets, refer to [82495/490DS] 
for the sequencing of HOLD, HLDA, CAHOLD, and 
RESET required to reset only the processor without 
destroying 82495 cache contents. For that purpose, a 
separate RESET line is advised for the CPU and 
82495/82490. The CPU RESET line must be wired to 
the WRMRST input of 82495, to force 82495 to assert 
the BRDY1# input to the CPU during a CPU-only reset
(the CPU uses the BRDY1# input during RESET
to know of the 82495's existence). The HOLD input of
the Intel486 DX CPU and i860 XP CPU processors 
should be kept low during normal operation with the 
82495, because floating the processor outputs may yield 
undefined 82495 behavior. 

FLUSH# (and SYNC#) of caches requested by software
must be decoded from the 82495 outputs CM/IO#,
CD/C#, and CW/R# (= 001) and the latched
BE3-0# from the CPU. BE3-0# values of 0111 or
1101 should activate the 82495 FLUSH# input, as the
Intel486 DX CPU outputs them in response to the
INVD and WBINVD instructions, respectively. Sync
and flush commands may also come from the bus as a
message in a multiprocessor system. The 82495 is smart
enough to allow assertion of FLUSH# or SYNC# at
any time, and will delay the beginning of the flushing
action until all current CPU and M-bus cycles have
completed. The inputs are edge-sensitive. If the bus
defines cache flush messages, the MBC may activate the
Intel486 DX CPU FLUSH# input as well as the
82495's in response to bus message decodes.
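
The software-flush decode described above can be sketched as a small predicate. This is a behavioral illustration only: the function and constant names are hypothetical, signal levels are passed as plain 0/1 integers, and the real MBC would implement the decode in PLD equations rather than software:

```python
FLUSH_BE_CODES = {0b0111, 0b1101}  # BE3-0# values named in the text


def is_flush_request(cm_io, cd_c, cw_r, be3_0):
    """Return True when the latched cycle should activate 82495 FLUSH#.

    cm_io, cd_c, cw_r are the logic levels of CM/IO#, CD/C#, CW/R#;
    be3_0 is the latched BE3-0# value as a 4-bit integer.
    """
    special_cycle = (cm_io, cd_c, cw_r) == (0, 0, 1)
    return special_cycle and be3_0 in FLUSH_BE_CODES
```

Any other byte-enable pattern on the same cycle type, or the same pattern on a different cycle type, leaves FLUSH# inactive in this sketch.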

Bus Error or Timeout Detection logic in the MBC can
use the CPU's PCHK# output or other M-bus-specific
signals to detect errors. Note that the assertion of
PCHK# will occur near the time of the error on the
M-bus ONLY for non-cacheable reads or 82495-cache-miss
reads. For 82495 hits and CPU-idle cycles,
PCHK# may arise due to a floating or erroneous CPU
data bus value transferred on the M-bus much earlier.
PCHK# must be ignored by the MBC except during
the CLK after data transfer to the CPU was signalled
by the MBC's CPU BRDY#, because PCHK# indicates
i860 XP CPU bus parity status at all times, not
just during clocks of BRDY# activation. The processor
inputs INT, BERR, or NMI can be asserted by the
MBC to signal errors. To detect errors originating in
the CPU or 82490 (upon a writeback), the MBC can
check parity on the 82490 MDATA pins or on the
M-bus.
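
The rule that PCHK# is meaningful only in the CLK after BRDY# can be illustrated with a short trace filter. This is a sketch under simplifying assumptions: both signals are modeled as one active-low sample per CLK, and the function name and sample data are invented for the example:

```python
def qualified_parity_errors(brdy_n, pchk_n):
    """Return CLK indices where PCHK# should be treated as a real error.

    brdy_n and pchk_n are per-CLK samples of the active-low BRDY# and
    PCHK# signals (0 = asserted). PCHK# is honored only in the CLK
    immediately after a CLK in which the MBC asserted BRDY#.
    """
    errors = []
    for clk in range(1, len(pchk_n)):
        brdy_was_active = brdy_n[clk - 1] == 0
        if brdy_was_active and pchk_n[clk] == 0:
            errors.append(clk)
    return errors


# BRDY# asserted in CLK 2 and CLK 5; PCHK# low in CLKs 1, 3, and 4.
brdy_n = [1, 1, 0, 1, 1, 0, 1]
pchk_n = [1, 0, 1, 0, 0, 1, 1]
errors = qualified_parity_errors(brdy_n, pchk_n)
```

Only the PCHK# assertion in CLK 3 follows a BRDY#; the assertions in CLKs 1 and 4 are discarded as status from an idle or floating data bus.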

If the memory bus includes a retry protocol, the MBC 
bears the responsibility to implement it, because the 
82495 will not retry accesses. For a pipelined MBC
interface when the retry occurs after CNA# to the
82495, the MBC must latch the address and other
controls (CW/R#, CM/IO#, etc.) from the 82495 to use
in retries. Retry should be triggered by signals other
than the CPU PCHK# output, because the CPU data 
transfer cannot be retried although the M-bus transfer 
can. 

The 82490 can restart a burst data transfer (for the case 
of an error detected after the first MBRDY# but be- 
fore MEOC# and before CRDY#). To restart the 
82490, the MBC must deassert MSEL# for at least 1 
MCLK. 

While parity is supported by the 82495 and 82490, 
ECC (Error Correcting Codes) cannot conveniently be 
used within the cache. ECC can be implemented on the 
memory system, but no loads are permitted on the 
CPU-to-82495/82490 interface wires for error checking 
logic. 

Scenarios requiring MBC action are 

1) CPU based requests ("Master" mode): 

• 82495 cache read miss (and line fill) 

• 82495 cache write miss 

• Non-cacheable CPU read (including i860 XP CPU 
pfld) 

• Writethrough (to S-state line) or Non-cacheable
CPU write

• I/O reads and writes 

• LOCKed reads and writes (will be readthrough or 
writethrough) 

2) 82495 based requests ("Master" mode): 

• Allocation due to write-miss (line fill) 

• Replacement writebacks 

• SNPADS# writebacks 

3) Requests from other masters ("Slave" mode): 

• Snooping of DMA accesses 

• Snooping of accesses of other CPUs (in a multipro- 
cessor) 

• Bus-specific requests, like interrupt messages, reset 
requests, cache flushes, configuration registers, ID 
registers, timeout detection, acknowledgements, 
TLB shootdown 



Transfer Tracking 

Tracking of transfers on the M-bus and CPUbus is
required of the MBC during all of the above scenarios.
This tracking (counting) of transfers involves activating
BRDY# the correct number of times for the CPU and
MBRDY# (a possibly different number) for the 82495
and 82490. Transactions on the CPUbus which must be
MBC-controlled can be 1, 2, or 4 data transfers,
decoded from the BLE#-latched CPU pins:

Intel486 DX CPU: BE3-0#, PWT, PCD
i860 XP CPU: BE7-0#, PWT, PCD, LEN, CACHE#

and from the 82495 pins CW/R#, MCACHE#,
RDYSRC (and CLEN1:CLEN0 for Intel486 DX CPU
mode).

See [82495/490DS] for a complete definition of the
encodings. The BRDY# activations must be done only if
RDYSRC = 1, and always correspond to the first 1, 2,
or 4 MBRDY#s for the 82490-M-bus interface. The
number of MBRDY#s always exceeds or is equal to
the number of BRDY#s, even for a 128-bit M-bus.

Bursts for line fills and writebacks on the CPUbus
are always 4 transfers, but with some 82495
configurations the M-bus burst is 8 transfers. The
addresses are nonsequential when the first access is not
at the zeroth word of the line. The addresses
corresponding to each BRDY# and MBRDY# follow these rules:

1) CPU burst addresses wrap at CPU line length.

2) When the line address is odd (A2 = 1 for 4-byte bus;
A3 = 1 for 8-byte bus; A4 = 1 for 16-byte M-bus), the
next address transferred on CDATA and MDATA
is the LOWER address (e.g., 3 followed by 2). The
odd-first-then-even pattern continues for all transfers
of the burst. This order optimizes interleaved
DRAM systems, and applies to both the M-bus and
CPUbus.

3) 82490 bursts on CDATA wrap at CPU line length.
82490 MDATA burst addresses wrap at 82490 line
length. For example, a linefill with LR = 4 and a first
Intel486 DX CPU address (A5:A2) = E,

82490 CDATA ordering is E F C D

82490 MDATA ordering is CDEF 89AB 4567 0123
(128-bit M-bus) OR EF CD AB 89 67 45 23 01
(64-bit M-bus)



For LR = 2 (Line Ratio of 82495 to CPU) and CPUbus width = M-bus width, the burst orders are listed
below. Each address corresponds to one 4-byte transfer (for Intel486 DX CPUs) or one 8-byte transfer
(for i860 XP CPU). Time increases left to right:

First Address   CPU transfers   M-bus transfers
0               0 1 2 3         0 1 2 3 4 5 6 7
1               1 0 3 2         1 0 3 2 5 4 7 6
2               2 3 0 1         2 3 0 1 6 7 4 5
3               3 2 1 0         3 2 1 0 7 6 5 4
4               4 5 6 7         4 5 6 7 0 1 2 3
5               5 4 7 6         5 4 7 6 1 0 3 2
6               6 7 4 5         6 7 4 5 2 3 0 1
7               7 6 5 4         7 6 5 4 3 2 1 0

For LR = 2 and M-bus = 2*CPUbus width (both buses using 4 transfers):

First Address   CPU transfers   M-bus transfers
0               0 1 2 3         01 23 45 67
1               1 0 3 2         01 23 45 67
2               2 3 0 1         23 01 67 45
3               3 2 1 0         23 01 67 45
4               4 5 6 7         45 67 01 23
5               5 4 7 6         45 67 01 23
6               6 7 4 5         67 45 23 01
7               7 6 5 4         67 45 23 01

The remaining transfer orderings for other LR values
can be generated similarly, as an exercise for the reader.

For requests originated by the 82495, the MBC must
ignore the CPU pins (CACHE#, LEN, PWT, PCD,
PCYC, CTYP, and BE7#-BE0#). These requests are
writebacks, allocations, or linefills. Also the MBC must
prevent the transfer of those signals to the M-bus for
82495 requests; for example, it must force all
BE7#-BE0# active during writebacks. The 82495 based
requests can be recognized by:

RDYSRC = 0 .AND. MCACHE# = 0 (for writebacks,
linefills, allocations)

RDYSRC = 0 .AND. MCACHE# = 0 .AND.
MKEN# = 0 (for linefills, allocations)

For posted write requests (RDYSRC = 0 and
MCACHE# = 1), the length is 1, 2, or 4 transfers and
the MBC must heed the BLE#-latched BE7-0#,
LEN, and CACHE#.
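
The burst orderings given above, including the LR = 4 example, are all consistent with exclusive-ORing the starting address with a transfer counter. The sketch below uses that observation to generate orderings for other configurations. It is an inference from the listed examples, not a rule quoted from the data sheet, and the function and parameter names are hypothetical:

```python
def cpu_burst_order(first, transfers=4):
    """CPUbus transfer addresses, in units of one CPUbus transfer."""
    return [first ^ i for i in range(transfers)]


def mbus_burst_order(first, line_ratio, width_ratio=1):
    """M-bus transfers for a line fill.

    line_ratio  -- LR, 82495 line length / CPU line length
    width_ratio -- M-bus width / CPUbus width (1, 2, or 4)
    Each M-bus transfer is returned as the list of CPUbus-sized
    addresses it carries, lowest address first.
    """
    total = 4 * line_ratio            # CPUbus-sized units per 82495 line
    count = total // width_ratio      # number of M-bus transfers
    start = first // width_ratio      # M-bus-sized index of the first access
    order = [start ^ i for i in range(count)]
    return [[g * width_ratio + k for k in range(width_ratio)] for g in order]
```

For instance, `mbus_burst_order(0xE, 4, 4)` yields the groups C-F, 8-B, 4-7, 0-3, which is the 128-bit MDATA ordering quoted for the LR = 4 example.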



Clock Boundaries and Synchronization 

To optimize performance, the 82495/82490 allow total
decoupling of the CPU clock at 50 MHz from the
M-bus clock. While both the CPU and M-bus could
run at 50 MHz, the physical size of the M-bus would be
severely constrained. Future faster versions of CPU and
82495/82490 would make a synchronous M-bus even
less feasible. However, with a 100% synchronous
interface, little time is lost in relaying requests from the
82495 CADS# to the M-bus, and in transferring data
from the M-bus to the CPUbus.

Yet with careful design, a slower M-bus such as
33 MHz can handshake with a 50 MHz 82495 with
only a couple of clocks spent on synchronizing.
Furthermore, the transfers requiring synchronizing are
fairly rare: uncached cycles, cache misses, and snooping.
CPU performance is improved further because the
82495/82490 always post writes destined for the
M-bus, allowing the CPU to continue processing upon
write cache-misses and non-cacheable writes.

Most of the 82495 operates on the CPU CLK. Only the 
snooping control inputs operate on another clock, 
called SNPCLK (SNPSTB#, SNPINV, SNPNCA). 
SNPCLK can be the same as the MCLK controlling 
82490 MDATA. A SNPCLK can be used with 82495, 
even if the 82490 is strobed without an MCLK. All 
82495 outputs, including snooping results (MHITM#, 
MTHIT#, SNPCYC#, and SNPBSY#) remain on 
the CPU CLK. 

The 82490 operates half in the CPU CLK domain and
half in the M-bus domain. While no control signals flow
through the 82490 between memory and the CPU, the
82490 implements a flow-through data connection of
CDATA to MDATA. Synchronization of the 2 data
paths is unneeded, as the control signal MBRDY# gets
synchronized by the MBC to the CPU-clocked BRDY#.
The MBRDY# and BRDY# inputs control multiplexers
inside the 82490 to choose which part of a line-fill or
write is transferred to/from the bus. The MDATA input
latches are closed on MCLK (or MISTB for non-clocked
operation), and CDATA input latches are
closed with CLK.



If MCLK = CLK at 50 MHz, approximately 1.5 CLK
periods are required to transfer data through the 82490,
including 82490 propagation delay (15 ns) and setup
time to both the 82490 (5 ns) and CPU (7 ns for i860
XP CPU "CMOS" levels). The MBC must assure data
setup time at the CPU D0-D31 (D63) pins to the rising
edge of CLK for the cycle of BRDY# assertion
during reads, based on the propagation delay from
MDATA to CDATA listed in the 82490 AC timing
specs. Writes are not flow-through, as the 82490 always
buffers the write data and the 82495 later gives CDTS#
for the write.
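
The flow-through delay budget above can be checked with a few lines of arithmetic. The figures are the ones quoted in the text; the variable names are illustrative, and reading the result as "approximately 1.5" periods assumes rounding up to the next half period, which the text does not state explicitly:

```python
CLK_MHZ = 50.0
clk_period_ns = 1000.0 / CLK_MHZ      # 20 ns at 50 MHz

# Delay figures quoted in the paragraph above.
prop_delay_82490_ns = 15.0
setup_82490_ns = 5.0
setup_cpu_ns = 7.0

total_ns = prop_delay_82490_ns + setup_82490_ns + setup_cpu_ns
clk_periods = total_ns / clk_period_ns
```

The three terms sum to 27 ns, or 1.35 CLK periods at 50 MHz, so data cannot reach the CPU within a single CLK of the M-bus transfer.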

Most of the MBC-to-82490 signals are sampled by the
82490 with MCLK, except for BRDY# and CRDY#:

82490 Signals

CLK domain:
BRDY#, CRDY#, CDATA

MCLK domain:
MBRDY#, MFRZ#, MZBT#, MDATA, MSEL#, MEOC#

MDOE# (asynchronous to both clocks)

The MBC must be partitioned into an MCLK side and
a CLK side. Fortunately, the CPU side of the MBC passes
only a few signals to the MCLK side, and vice versa.
The signals listed below from the dual-i860 XP CPU
MBC design in Appendix C must go through a
synchronizer. Refer to the Appendix for signal definitions.
In the following diagram, a right-arrow (-->) identifies
synchronizing to CLK, while a left-arrow (<--)
means synchronizers on MCLK:

Clock Domain of the Signal:

MCLK or SNPCLK     Neither      CLK

MRESET                          RESET
YBGT#                           BGT#
YMEOC#                          CRDY#
YCEOC#                          BRDY#_maybe
MBRDY#                          BRDY#_maybe
MSWEND#            MSWENDA      SWEND#
MADS#                           CADS# .or. SNPADS# .or. CDTS#



The signals MKWEND# and MNA# might also need synchronizing to CLK, if they are derived from M-bus 
responses. 

Two TI 74AS4374 "Dual-Rank Synchronizer" chips
(Figure 7) are used to transfer critical signals between
clock domains, while avoiding metastability. This
20-pin DIP has one clock input and 8 pairs of flip-flops.
Thus each of the 8 "Q" outputs reflects the value of its
"D" input after 2 clock periods. One chip is clocked by
CLK and the other by MCLK. If fewer than 8 signals
need synchronizing, chips such as the Signetics
74F50728 or Intel's 85C220 EPLD can combine
synchronization with other functions [Ham90].
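
The two-clock latency of a dual-rank synchronizer can be seen in a simple behavioral model of one D-to-Q path. This sketch is purely logical: it does not model metastability or the electrical behavior of the 74AS4374, and the class name and sample waveform are hypothetical:

```python
class DualRankSynchronizer:
    """Behavioral model of one D-to-Q path: two flip-flops on one clock."""

    def __init__(self, initial=1):
        self.rank1 = initial   # first flip-flop (may see a marginal input)
        self.rank2 = initial   # second flip-flop drives the Q output

    def clock(self, d):
        """Apply one rising clock edge with input d; return the new Q."""
        self.rank2 = self.rank1
        self.rank1 = d
        return self.rank2


sync = DualRankSynchronizer(initial=1)
mbrdy_n = [1, 0, 0, 1, 1, 1]             # active-low pulse, two clocks wide
q = [sync.clock(d) for d in mbrdy_n]
```

A change on D is captured by the first rank on one edge and reaches Q on the next, so the pulse that begins at the second edge appears on Q at the third.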

For an asynchronous or strobed memory bus, M-bus 
signals (such as MBRDY#) get delayed by the syn- 
chronizer for 2 CLK periods before the 82495 can see 
them. For a clocked (but not by CLK) M-bus, 82495 
outputs (such as CADS#) get delayed by 2 MCLKs by 
the other synchronizer before the M-bus sees them. 

The following 82495 signals are defined as
"asynchronous", meaning that no external synchronizer is
required:

• FLUSH#, SYNC#

• MALE, MBALE

• MAOE#, MBAOE#

Many signals can cross clock boundaries without
synchronizing, because they will be ignored until
corresponding status signals such as SWEND# and
CADS# have been synchronized by the MBC. Thus
they will be stable when sampled:

• MWB/WT#, DRCTM#, MTHIT#, MHITM#
(sampled when SWEND#)

• RDYSRC, KLOCK#, CPLOCK#, CW/R#,
CD/C#, CM/IO#, MCACHE#, BE7:0# (sampled
when CADS#)

Other signals do not cross clock boundaries, but remain
within the MBC CLK logic:

• CNA#, PALLC#, CACHE#, LEN, PCD, PWT,
CTYP, PCYC, MFRZ# ...



Figure 7. Synchronizer Hardware and Waveforms

Synchronizer Delays

To avoid lost time due to synchronizer delays, the
following options exist:

1. Pipeline the 82495/MBC interface. This hides the
delay in synchronizing CADS# to its MCLK
counterpart MADS#.

2. Define the M-bus protocol so that MBRDY#
precedes MDATA by 1 MCLK for reads. Thus the 2
CLK delay in creating BRDY# from MBRDY# is
hidden. Likewise define MSWEND# to precede
MHITM# and MTHIT# by a CLK, by generating
MSWEND# from SNPCYC#.

3. Keep the snooping signals (SWEND#, MHITM#,
MTHIT#, SNPINV, SNPCYC#) which flow between
82495s on the same CLK, so that no synchronizers
enter the snoop path. This is feasible only for a
small number of physically proximate CPUs.

4. Synchronize the snooping feedback signals from the
M-bus (MSWEND#, etc.) only at the destination.
They will be asynchronous to MCLK, transitioning
with the individual CLK of their source.

5. Avoid MCLK, using a strobed-only M-bus. Strobed
buses appear in single-CPU systems with an
unclocked DRAM interface.

6. Activate MEOC# to the 82490 as soon as possible after
the last MBRDY#. MEOC# allows the 82490 to begin
the next data transfer without waiting for CRDY#
synchronization.



BRDY# Generation 

Below are recommended sequences of the 82490 and 
CPU burst-transfer "Readys" for CPU reads, assuming 
the bus widths are equal. Sequences with more clocks of 
delay are acceptable but suboptimal. 

1) Synchronous M-bus (MCLK = CLK): MBRDY#
precedes BRDY# by 1 or 2 CLKs, to allow propagation
time for data through the 82490 and setup
time at the CPU pins.

2) "Divided Synchronous" M-bus (e.g., CLK = 50 
MHz, MCLK = 25 MHz, skew controlled): 
MBRDY# precedes BRDY# by 1 or 2 CLKs. The 
BRDY# state machine must ignore MBRDY# in 
the CLK period after it was sampled active. 

3) Other Clocked M-bus (MCLK < CLK): 
MBRDY# must go through a dual-rank synchroni- 
zer latch (such as the TI 74AS4374) clocked by 
CLK to produce BRDY#. That means 2 CLK de- 
lays between MBRDY# and BRDY#. MBRDY# 
MUST remain active for at least 1 CLK period to 
assure that the synchronizer latched it active. To 
avoid one MBRDY# getting wrongly sampled ac- 
tive twice, the BRDY# state machine should ignore 
any second MBRDY# in the CLK period after it 
was sampled active. 

4) Strobed M-bus: here MISTB# must go through the 
synchronizer with 2 CLK delays to create BRDY#. 
An edge-sensitive strobed M-bus avoids the problem 
of wrongly converting one M-bus transfer to 2 
BRDY#s, as a level-change marks each M-bus 
transfer. 
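
The "ignore MBRDY# in the CLK after it was sampled active" rule from cases 2 and 3 above can be sketched as a tiny state machine. This is a simplified behavioral model: the input is assumed to be the already-synchronized MBRDY#, one active-low sample per CLK, BRDY# is produced in the same CLK as the accepted sample, and the function name is hypothetical:

```python
def brdy_from_synced_mbrdy(mbrdy_sync_n):
    """Convert synchronized MBRDY# samples (one per CLK, 0 = active)
    into BRDY# samples, ignoring MBRDY# in the CLK right after it
    was sampled active so a stretched pulse yields a single BRDY#."""
    brdy_n = []
    ignore_next = False
    for sample in mbrdy_sync_n:
        if ignore_next:
            brdy_n.append(1)        # blanking CLK: MBRDY# is not examined
            ignore_next = False
        elif sample == 0:
            brdy_n.append(0)        # assert BRDY# for one CLK
            ignore_next = True
        else:
            brdy_n.append(1)
    return brdy_n


# A 33 MHz MBRDY# pulse can span two 50 MHz CLK samples.
stretched = [1, 0, 0, 1, 1, 0, 0, 1]
brdy_n = brdy_from_synced_mbrdy(stretched)
```

Each two-sample MBRDY# pulse in the example produces exactly one BRDY#, which is the behavior the blanking CLK is meant to guarantee.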



When M-bus width is greater than CPUbus width, the 
above rule holds only for the first BRDY#. Successive 
BRDY# activations follow the rules below: 

• M-bus = 2*CPUbus: 2 BRDY#s occur for each of
the first 2 MBRDY#s. The second BRDY# should
occur 1 CLK after the first. The third BRDY#
cannot begin until after the second MBRDY#.

• M-bus = 4*CPUbus: 4 BRDY#s occur for the
MBRDY#. The last 3 BRDY#s can occur
immediately in the 3 CLKs after the first BRDY#.

For asynchronous systems (MCLK < CLK), high per- 
formance design choices are: 

M-bus width = 2 * CPUbus width OR 
M-bus width = 4 * CPUbus width 

The wider M-bus allows each M-bus transfer to satisfy
2 or 4 CPU transfers, so that the CPU is not starved for
data during a line fill. The 82490 switches its CDATA
outputs to the next value in the CLK after BRDY#
assertion by the MBC for the current value, so the MBC
controls the provision of data to the CPU on linefills.
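
The BRDY# sequencing for a wider M-bus can be sketched as a schedule calculation. The model below is an illustration under stated assumptions: time is counted in CLK periods, each MBRDY# becomes usable a fixed `sync_delay` CLKs after it is seen (2 is assumed, matching the dual-rank synchronizer), and the function and parameter names are hypothetical:

```python
def brdy_schedule(mbrdy_clks, width_ratio, sync_delay=2, cpu_transfers=4):
    """Return the CLK numbers at which BRDY# is asserted.

    mbrdy_clks  -- CLK numbers at which each MBRDY# is seen (ascending)
    width_ratio -- M-bus width / CPUbus width (1, 2, or 4)
    sync_delay  -- CLKs from MBRDY# to its first BRDY# (assumed 2)
    """
    schedule = []
    needed = cpu_transfers // width_ratio      # MBRDY#s that feed the CPU
    for t in mbrdy_clks[:needed]:
        start = t + sync_delay
        if schedule:                           # never overlap earlier BRDY#s
            start = max(start, schedule[-1] + 1)
        schedule.extend(start + k for k in range(width_ratio))
    return schedule
```

With a double-width M-bus and MBRDY#s at CLKs 0 and 3, the sketch gives BRDY#s at CLKs 2, 3, 5, and 6; with a quadruple-width M-bus, a single MBRDY# at CLK 0 gives four back-to-back BRDY#s at CLKs 2 through 5.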

A low-cost MBC can use M-bus width = CPUbus width with
a slower MCLK, by converting the first MBRDY# to
BRDY# through a synchronizer. The last 3 BRDY#s
can be asserted by the MBC after completion of all the
M-bus transfers. That will allow the CPU to proceed
executing after receiving the first datum, which is the one
it was waiting for in most cases. Alternatively, the
M-bus protocol can be defined so that no idle clocks occur
on the M-bus after the first MBRDY# and the MBC
knows by counting CLKs when to assert successive
BRDY#s.

Shown in the following timing diagrams are data trans- 
fers on both buses for CPU reads. Although they as- 
sume no dead clocks (wait states) during the M-bus 
burst, dead clocks are allowable. 

Writes are not shown in the diagrams because the MBC
never supplies the CPU BRDY#s for burst writes.
RDYSRC = 0 for most writes, and the 82495 controls
the CPUbus transfers. The exception to this rule is I/O
writes, which the 82495 does not post; for I/O writes, the
MBC supplies BRDY# to the CPU, but I/O accesses
are always 1 non-bursting transfer.
Figure 8. Data Transfers, M-bus Width 

Figure 9. Data Transfers, M-bus Width = CPUbus Width. CLK/2 < MCLK < CLK. 
Note the starvation on the CPUbus (extra wait state) 

Figure 10. Data Transfers, M-bus Width = 2*CPUbus. CLK/2 < MCLK < CLK 

Figure 11. Data Transfers, M-bus Width = 4*CPUbus 

Pipelining 

Pipelining the MBC-to-82495 interface reduces latency 
by allowing the MBC to arbitrate for the next M-bus 
transaction while the first is proceeding. If the M-bus is 
also pipelined, it allows the snoop for the next to begin 
during the data transfer for the first. 

Signals used in pipelining the 82495 are CNA#, 
BGT#, MALE, KWEND#, SWEND#, and 
CDTS#. The 82495 will not listen to CNA# until the 
clock of BGT# activation. Also, KWEND# activation 
sometimes allows the 82495 to create a next cycle, such 
as an allocation after a write miss. MALE deassertion 
allows the memory address to remain at the value for a 
previous request, even though the next request CADS# 
and other control signals have already occurred in re- 
sponse to CNA#. The MBC must latch the 82495 out- 
put signals which change in response to CNA#, until 
their status no longer matters to ongoing cycles. 

Note that 82495 and 82490 automatically pipeline the 
CPUbus interface to i860 XP CPU by activating NA# 
and latching address and data. 

Pipelining the M-bus itself involves sending a next ad- 
dress for snooping and DRAM access while data trans- 
fer from the current address still remains incomplete. 
This increases bandwidth by overlapping slow DRAM 
access with bus data and address transfers, as in the 
i860 XP CPU pipelined bus. 

While each 82495 allows only a one-stage deep pipeline, 
the M-bus can have a deeper pipe as requests from sev- 
eral different 82495s can be in progress. The number of 
stages in the M-bus pipe should match memory access 
latency. For example, use 3 stages for a 240 ns memory 
with a 120 ns bus MADS#-to-MNA# (and 
SWEND#) time, so that a second and third request get 
issued during the memory latency of the first. Pipelining 
does not imply that multiple snoops are ongoing 
waiting for SWEND#; that is a split-transaction bus, 
defined in a later section. Thus a quick SWEND# 
turnaround time speeds a new request onto the M-bus. 

The advantage of a pipelined bus using a 4-transfer 
burst is illustrated in Figures 12 and 13. Assumed is a 
fast memory access time of 4 MCLKs. With a slower 
access time, pipelining becomes more important for 
maintaining data bus bandwidth; even with the 
4-MCLK access, the unpipelined data bus is idle 50% 
of the time. 
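
The two numbers quoted in this section (3 stages for a 240 ns memory on a 120 ns request interval, and a 50% idle data bus for a 4-MCLK access with a 4-transfer burst) can be reproduced with the sketch below. The function names are hypothetical, and the formulas are one reading of the text: one stage for the request being served plus one per request issued during its memory latency, and one data transfer per MCLK.

```python
import math


def pipeline_stages(memory_latency_ns, request_interval_ns):
    """Stages needed so new requests keep issuing during one memory access."""
    # One stage for the request in service, plus one for each request that
    # can be issued while that request waits on memory.
    return 1 + math.ceil(memory_latency_ns / request_interval_ns)


def unpipelined_idle_fraction(access_mclks, burst_transfers):
    """Fraction of time the data bus is idle when accesses do not overlap."""
    # Assumes one data transfer per MCLK during the burst.
    return access_mclks / (access_mclks + burst_transfers)
```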



Figure 12. Data Transfers for Non-Pipelined M-bus. Note low MDATA Bandwidth. 

Figure 13. Data Transfers for Pipelined M-bus 

M-bus Arbitration 

If the M-bus possesses more than one master, each 
MBC must arbitrate to gain control of the M-bus whenever 
its 82495 activates CADS#. No arbitration logic is 
included in the 82495 or 82490, except for the ability to 
float (high-impedance) the 82495 and 82490 M-bus outputs 
via the MAOE# and MDOE# signals. The 
BGT# and MAOE# inputs to the 82495 are from MBC 
arbitration logic. The simplest systems can use a 
HOLD/HLDA/BREQ protocol like the i860 XP CPU 
and Intel486 DX CPUs themselves, which is centralized 
arbitration. 

Expandable buses like Futurebus+ and Multibus-II use 
distributed arbitration to allow a variable number of 
masters. Bus parking (retaining ownership of the M-bus 
until another master requests it) is advised to avoid 
unnecessary delay. 

The "restricted backoff protocol" of 82495 requires 
that it be granted the bus for a modified-line writeback 
after it activates MHITM#, before it will snoop or ini- 
tiate any other transactions. The snooping MBC must 
relinquish the M-bus immediately after the CRDY# of 
the M-line writeback so that the original owner can 
complete its work. 



Sequencing 

A typical sequence of request and response signals between 
the 82495 and MBC is shown in Figure 14. The 
"SL" entities (CPU SL, 82495 SL, 82490 SL, MBC SL) are 
for another CPU/Cache core, the SLave(s) who snoop 
when the master CPU owns the bus. No DMA (such as 
EISA or MCA) interaction is shown, but it will be simi- 
lar to the CPU responses, except that no writeback will 
be done by DMA. Time increases downward. A minus- 
sign prefix means deassertion. 

The arbitration for the M-bus shown in the diagram 
assumes a HOLD/HLDA protocol like the CPUs use. 
That is a primitive centralized scheme, suitable only for 
a small number of processors. 

The sequencing may vary from that shown; for exam- 
ple, MSEL# may precede CDTS#. MADS#, 
MW/R#, MA31:3, MM/IO#, MD/C#, and 
MBE7-0# would all be valid simultaneously. The sig- 
nals in parentheses would be asserted only in the case of 
an M-line hit in the snooper, and some signals for that 
writeback and possible cache-to-cache transfer are not 
shown. 
[Figure 14 is a signal sequence chart. Columns, left to right: CPU, 82490, 82495, MBC, M-BUS, MBC SL, 82495 SL, 82490 SL, CPU SL. 
Signals, in the order they appear from top to bottom: ADS#, CADS#, MBREQ, MHOLD, -MAOE#, -MHOLD, MHLDA, MAOE#, 
MADS#, MALE, MA31:3, MW/R# etc., SNPSTB#, MKEN#, MRO#, SNPINV#, SNPBSY#*, KWEND#, BGT#, CNA#*, CDTS#*, 
SNPCYC#, MSEL#*, MDOE#, MHITM#, MWB/WT#, MTHIT#, DRCTM#*, (-MAOE#), (MADS#), (SNPADS#), (MDOE#), (MSEL#), 
(MBRDY#), MBRDY#, SWEND#, MSWEND#*, BRDY#, MEOC#, CRDY#, -MAOE#.] 

* = Signal might occur sooner or not at all, depending on the type of request and bus protocol. 
** = These lines of the sequence occur only on a Hit-to-Modified (MHITM#) 

Figure 14. MBC Signals and Protocol Layers 
Flowchart of MBC Algorithm (not applicable to all cases) 

1) M-bus already owned by this MBC? 
   N: Arbitrate for bus. 
   Y: Continue with the next step. 

2) Enable 82495 to drive address to bus (MAOE#, MALE). 
   Echo other request parameters (MW/R#, MCACHE#, etc.) to the bus. 

3) Assert BGT#. 

4) Determine cacheability, assert pins KWEND#, MKEN#, MRO#. 
   Latch control signals (MW/R#, etc.). 
   Assert CNA# to invoke next 82495 request. 

5) MHITM# from other masters? 
   Y: Abort memory cycle. Do cache-to-cache transfer. 

6) Wait for CDTS# (before beginning data transfer). 

7) Forward snoop responses to master 82495 
   using SWEND#, MWB/WT#, DRCTM#. 

8) Signal burst transfers of M-bus via MBRDY#. 
   If RDYSRC = 1, echo burst transfer acknowledgments on BRDY#. 
   Compensate for LR<>1 by stopping BRDY# assertion when CPU line filled. 

9) Notify 82495 and 82490 of completion of transfer via MEOC# and CRDY#. 

10) New CADS#? 
   N: Relinquish bus ownership. 
      Deassert MAOE# to re-enable snooping by this 82495. 
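
As a reading aid, the flowchart can be written out as a function that lists the MBC's actions for one request. This is a sketch only: the function and parameter names are hypothetical, the step strings paraphrase the flowchart boxes, and real MBC logic is clocked hardware rather than a sequential routine.

```python
def mbc_cycle(owns_bus, mhitm_from_other, rdysrc, new_cads):
    """Return the ordered list of MBC actions for one 82495 request."""
    steps = []
    if not owns_bus:
        steps.append("arbitrate for M-bus")
    steps.append("enable address (MAOE#, MALE); echo request parameters")
    steps.append("assert BGT#")
    steps.append("decode cacheability; assert KWEND#, MKEN#, MRO#; latch controls; assert CNA#")
    if mhitm_from_other:
        steps.append("abort memory cycle; do cache-to-cache transfer")
    steps.append("wait for CDTS#")
    steps.append("forward snoop results via SWEND#, MWB/WT#, DRCTM#")
    steps.append("signal burst transfers via MBRDY#")
    if rdysrc == 1:
        steps.append("echo transfers on BRDY#")
    steps.append("signal completion via MEOC# and CRDY#")
    if not new_cads:
        steps.append("relinquish bus; deassert MAOE#")
    return steps
```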
Cacheability of each request must be determined by the 
MBC to prevent the 82495 and CPU from caching 
things like memory-mapped I/O device registers. The 
i860 XP CPU samples its KEN# (Cache ENable) 
pin at the time of the first BRDY# for a transfer or at 
NA#, whichever comes first. The 82495 offers more 
flexibility than the CPU cacheability indicators, by using 
the KWEND# (cacheability Window END) input 
to indicate validity of the MKEN# and MRO# 
pins. The values of MKEN# and MRO# are based on 
address decode, either locally in the MBC or from a 
centralized decoder on the memory bus. For best per- 
formance, KWEND# should come as soon as possible, 
as it allows 82495 to decide what the next CADS# 
should be — for example, to begin an allocation for a 
write miss, or to start another writethrough. 

A typical implementation would activate KWEND# 2 
clocks after CADS#, using a PLD or fast SRAM to 
decode the upper bits of the address to generate 
MKEN# and MRO#. 

Note that KWEND#, SWEND#, and BGT# need 
not be asserted by the MBC for SNPADS# cycles 
(snoop writebacks), but it may be simpler to assert 
them always. 
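
The address decode that produces MKEN# and MRO# can be modeled as a range lookup. The memory map below is entirely hypothetical (the ranges and their attributes are invented for illustration); only the idea of decoding the address into the two active-low pins comes from the text.

```python
# Hypothetical memory map: (start, end_exclusive, cacheable, read_only)
MEMORY_MAP = [
    (0x00000000, 0x00A00000, True, False),    # DRAM: cacheable, writable
    (0xFFF00000, 0x100000000, True, True),    # ROM: cacheable, read-only
]


def decode(address):
    """Return (mken_n, mro_n) pin levels; 0 means asserted (active low)."""
    for start, end, cacheable, read_only in MEMORY_MAP:
        if start <= address < end:
            return (0 if cacheable else 1, 0 if read_only else 1)
    # Anything unmapped (e.g. memory-mapped I/O) is treated as non-cacheable.
    return (1, 1)
```

A PLD or SRAM lookup on the upper address bits plays the role of `MEMORY_MAP` in hardware.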



Snooping 

Snoop handshaking (bus watching) is useful in a multi- 
processor system, and may be needed in a uniprocessor 
system where the 82495 and CPU caches must be kept 
consistent with DMA accesses. The 82495 must snoop 
all DMA accesses to memory. The MBC sees requests 
from DMA (or other processors) on M-bus and con- 
verts them to SNPSTB# activations to the 82495. The 
following scenarios are possible: 

• DMA (or other processor) read causes 82495 
MHITM#: 82495/82490 must writeback the modi- 
fied line to memory before the first DMA data 
transfer occurs (unless the DMA controller is capa- 
ble of re-trying the read. If the DMA can retry, then 
the 82495 writeback must cause the initial DMA 
access to be aborted.) The MBC can assert 
SNPNCA (SNooP Non-CAcheable Access) to the 
82495 for a DMA read, so that the 82495 knows it 
can keep the block Exclusive upon a hit. 

• DMA (or other) read causes 82495 MTHIT# but 
not MHITM#: MBC must assert the "shared" 
status line of the M-bus, if the bus includes such a 
line. 

• DMA (or other) write causes 82495 MHITM#: 
82495/82490 must writeback the modified line to 
memory before the first DMA data transfer occurs. 
SNPINV should be activated to 82495 to invalidate 
the line. 

• DMA (or other) write causes 82495 MTHIT# but 
not MHITM#: SNPINV should be activated to 
82495 to invalidate the line. 

Note that 82495/82490 cannot "write snarf": they do not 
absorb write-data from the memory bus and merge it with 
current cached contents of the line. However, they can 
absorb a full-line writeback from the M-bus when doing a 
linefill of the same address (see the section on 
Cache-to-Cache Transfers). 
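
The four snoop scenarios listed above reduce to a small decision function. This is a sketch with hypothetical names; the action strings paraphrase the bullets, and the optional SNPNCA assertion for DMA reads is left out.

```python
def snoop_actions(access, mthit, mhitm):
    """Actions required when a DMA or other-processor access is snooped.

    access is "read" or "write"; mthit and mhitm are the 82495 snoop results.
    """
    actions = []
    if mhitm:
        actions.append("write back modified line before first data transfer")
    elif mthit and access == "read":
        actions.append("assert M-bus shared line, if the bus has one")
    if access == "write" and (mthit or mhitm):
        actions.append("assert SNPINV to invalidate the line")
    return actions
```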

Bus size adaptation can be done by the MBC, although 
it is not necessary in most systems. In an Intel486 DX 
CPU or i860 XP CPU system without an 82495/82490, 
an 8-bit device like a ROM can be used to contain code, 
and the CPU will automatically fetch at byte-width 
when the BS8# (Intel486 DX CPU) or CS8 (i860 XP 
CPU) pin is asserted. However, if a byte-wide ROM is 
used with an 82495/82490, adaptation of this byte in- 
terface is required from the MBC. 

If the ROM code is to be cacheable, the MBC must 
convert the 82495 line fetches at the ROM location to 
the appropriate number of byte-wide ROM reads. 
Latching transceivers must be employed at the 82490 
MDATA inputs or at the ROM output, to assemble the 
single-byte ROM reads into 4 (or 8) bus-width-wide 
transfers to the 82490s. 

If the particular M-bus protocol requires transfer 
widths shorter than the 82490 data width used, the ad- 
dress range requiring such transfers can be made non- 
cacheable to force 82495 and 82490 to use the width 
given in the request from the CPU. 

Bus size adaptation would also be needed to support a 
512kB cache on a 32-bit memory bus. In that case, the 
MBC must control transceivers and MBRDY#s to in- 
terface between the 64-bit 82490 MDATA path and the 
32-bit M-bus. 



Bus Signal Levels 

Redriving 82495/82490 signals to the M-bus (such as 
MDATA, addresses, and 82495 control outputs) can 
optionally be done by the MBC. If the M-bus signal 
levels are not TTL (for example ECL or Futurebus+ BTL, 
Backplane Transceiver Logic), then appropriate transceivers 
must lie between the M-bus and 82495/82490. 
Also M-buses with heavy capacitive loads should be 
redriven by transceivers, although 82495 and 82490 can 
tolerate loads of up to 100 pF. 

An additional advantage of buffering the 82495/82490 
signals with transceivers in a multiprocessor is that a 
"local M-bus" will exist between the chips and the 
main system M-bus. That allows some local traffic from 
the CPU module to attached peripherals to avoid tra- 
versing the M-bus. Such peripherals might include an 
MPIC/CCU (Multiprocessor Interrupt Controller/ 
Concurrency Control Unit), a JTAG boundary-scan 
controller, or a time-of-day clock, as in the Sequent 
Symmetry multiprocessor. 
8.0 MBC FUNCTIONS FOR MULTIPROCESSORS 

Multiprocessor cache designs have additional motiva- 
tions beyond the uniprocessor goal of reducing memory 
access latency. Reducing memory bus usage is especial- 
ly important because the sharing of the bus creates a 
bottleneck. Thus multi-82495 systems need to minimize 
the number of transactions and make each one as short 
as possible. Large caches (256k or 512k) are recom- 
mended for multis, to keep the miss rate as low as pos- 
sible. 

In addition to the uniprocessor functions, an MBC in a 
multiprocessor must handle consistency with caching 
agents other than its own 82495. The multiprocessor 
MBC may also for performance reasons implement 
snoop filtering, cache-to-cache transfers, read-for-own- 
ership, and split transactions. 

Snooping results from listeners (slaves) on the bus must 
be fed back to the master 82495 by the time SWEND# 
is activated, if the system uses writeback policy (writethrough 
requires no feedback). These results 
(DRCTM#, MWB/WT#) are translations of the 
slaves' MHITM# and MTHIT# outputs. As shown in 
Figure 15, typically all MHITM# outputs would be 
wired-or via open-collector transceivers. Because slaves 
on the bus may be busy with CPU operations and 
back-invalidations, the snoop delay can vary. Thus a latched 
derivative of the SNPCYC# output of all 82495s 
would be wired-or to derive SWEND#. Alternatively, 
the MBC can count CLKs to generate SWEND#, using 
the worst-case upper bound of CLKs required for 
all 82495s to snoop, but that makes all snoop windows 
long. 

Because 82495 will tolerate SWEND# arrival up until 
CRDY#, the M-bus data transfer for reads can overlap 
the snooping delay. The transfers (MBRDY#s) can oc- 
cur during snoop latency, and an MHITM# activation 
would cause the MBC to restart the transfer using 
82490's MSEL# pin. 

Figure 15. Creating Snoop Results from MHITM#, MTHIT#, and SNPCYC# 

If an 82495 linefill or writethrough hits a dirty line in 
another cache, the MBC cannot BACKOFF the 82495. 
Call that other cache "the dirty 82495" and the 
initiating 82495 "the master 82495". The master MBC 
must force a retry of the access after the dirty 82495 
dumps the line, but the master 82495 has no "Backoff 
and Retry" input pin. Rather, on a linefill the master 
82495 must see the data transfer as if it had come from 
memory. On a write, the master 82495/82490 data 
write must wait until the modified line from the dirty 
82495 has been dumped to memory. To do so, the master 
MBC can either: 

1) Delay the corresponding MBRDY#s to the master 
82490 until the modified line is completely written 
into memory and read out of memory. That implies 
the master MBC will remake the initial request to 
the memory controller after the writeback. 

OR 

2) Create a cache-to-cache transfer, so that the write- 
back data movements go directly into the master 
82490 over the M-bus. A later section describes 
cache-to-cache transfers. Such transfers are quicker 
than waiting for the entire modified line to be writ- 
ten back to memory. 

Note that the 82490 can restart the data transfer for 
reads or writes, in the case of MHITM# activation 
after the first MBRDY# but before MEOC# and be- 
fore CRDY#. To restart the 82490, the MBC must 
deassert MSEL# for at least 1 MCLK. 

Snoop Window Time (the delay from MADS# to 
SWEND#) limits address-bus bandwidth. In the inter- 
val from the address on M-bus until the acknowledge- 
ment (SNPCYC#) by all listeners, no more requests 
(addresses) can be on the bus. This restriction is im- 
plied by: 

1) A typical M-bus has only one MSWEND# wire, 
which cannot be identified with the proper request if 
several requests are outstanding. 

2) 82495 does not snoop between BGT# and SWEND#. 

3) 82495's "restricted backoff protocol". That protocol 
requires the M-line writeback to be the first transac- 
tion by any 82495 which generates MHITM#, and 
82495 cannot snoop anymore until it finishes the 
MHITM# writeback. 

Data for read-misses cannot be transferred on the 
CPUbus until SWEND#, because the MBC cannot 
abort a CPU transfer after giving the first BRDY#. 
Thus the snoop window length influences CPU per- 
formance. Depending on the number of processors, bus 
speed, and memory speed, two scenarios arise from 
snoop window length versus memory access latency: 

1) Snoop Window < Memory Latency: SWEND# precedes 
the MBRDY#s. If MHITM# occurs, the original 
memory access can be aborted and its MBRDY#s 
must be ignored. 

2) Snoop Window > Memory Latency: data transfer on 
M-bus can proceed, with MBRDY#s causing 82490 
linefill buffers to advance. After SWEND#, the 
MBC can begin BRDY#s to the CPU and 82490 if 
MHITM# is inactive. If MHITM# is active, the 
MBC must restart the M-bus data transfers after (or 
during) the writeback from the modified snooper, 
and can begin BRDY#s immediately after the first 
MBRDY#. 

The typical snoop window in a multiprocessor using 
the hardware of Figure 15 is about 7 CLKs total snoop 
turnaround delay, shown in Figure 16: 

  1 CLK         for propagation delay of master's MADS# 
                (to slave 82495s' SNPSTB# inputs) 
+ 0.5 to 1 CLK  for 82495 to internally latch SNPSTB# 
                and synchronize it to CLK 
+ 1 CLK         for 82495 tag lookup and SNPCYC# 
                (or more, if 82495 is busy with SNPBSY#) 
+ 1 CLK         to latch SNPCYC# into the MBC Set/Reset 
                flip-flop generating MSWENDA 
+ 1 CLK         for MSWENDA open-collector buffer and 
                settling time from all slaves 
+ 2 CLKs        for MSWENDA to get through synchronizer 
                (on the master MBC's CLK) and inverter to 
                generate SWEND# to the master 82495 

The window total assumes that the slave 82495s' one 
CLK delay from SNPCYC# until MHITM# is concurrent 
with the synchronizer delay for creating 
SWEND# from MSWENDA at the master. Those 2 
CLKs can overlap with the next MADS# if it is asynchronously 
generated from MSWENDA. Shorter 
snoop window times can be obtained using duplicate 
external tags as explained later, but this is not trivial. 
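
The "about 7 CLKs" figure is simply the sum of the per-stage delays listed for the snoop window. The sketch below adds them up as a minimum and maximum; the table and function names are hypothetical, and the optional argument models extra clocks while a slave 82495 is busy (SNPBSY#).

```python
# (stage, min CLKs, max CLKs), taken from the snoop window breakdown
SNOOP_WINDOW_STAGES = [
    ("MADS# propagation to slave SNPSTB# inputs", 1.0, 1.0),
    ("82495 latches and synchronizes SNPSTB#", 0.5, 1.0),
    ("82495 tag lookup and SNPCYC#", 1.0, 1.0),
    ("latch SNPCYC# into MBC flip-flop (MSWENDA)", 1.0, 1.0),
    ("MSWENDA open-collector buffer and settling", 1.0, 1.0),
    ("synchronizer and inverter to SWEND#", 2.0, 2.0),
]


def snoop_window(extra_busy_clks=0.0):
    """Return (min, max) snoop window length in CLKs."""
    low = sum(stage[1] for stage in SNOOP_WINDOW_STAGES) + extra_busy_clks
    high = sum(stage[2] for stage in SNOOP_WINDOW_STAGES) + extra_busy_clks
    return low, high
```

With no busy slaves this gives a window of 6.5 to 7 CLKs.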

Read for Ownership (RFO) protocols decrease bus traf- 
fic by avoiding the M-bus write which would occur 
upon a write-miss. That is, a write-miss would go to the 
bus, followed by an 82495 line allocation request for the 
missed area. With RFO, the MBC does not echo the 
82495 write request to the M-bus. Instead, it asserts 
MFRZ# to freeze the written data in the 82490 memo- 
ry buffer, and allows the subsequent 82495 allocate line 
request to go to the bus. When the line data returns on 
the M-bus, MBC asserts DRCTM# to cause the 82495 
to mark the line as Modified (the memory system and 
other caching agents do not know of the original write 
miss, so they have invalid copies of the line). 

Figure 16. Snoop Waveforms (signals shown: SNPCYC#, LSNPCYC#, MSWENDA#, SWEND#; 
annotations: slave 82495s see SNPSTB#; slave 82495s begin snoop (CLK) after 
internally synching SNPSTB#) 

Signals which the MBC must use to do RFO are: 

1) PALLC# (Potential ALLoCate): from the 82495, 
must be active on the write miss. If not, RFO cannot 
be performed. 

2) MKEN# and CRDY#: must be activated by the 
MBC for the write, to trigger the 82495's subsequent 
allocation request. 

3) MFRZ#: must be activated by the MBC to the 
82490 at the time of the MEOC# and CRDY# for 
the write. 

4) INVAL (memory bus Invalidate indication): must 
be asserted by the MBC during the allocate-read to 
force all other 82495s to invalidate their now-obsolete 
copies of the line. Slave MBCs will assert 
SNPINV to 82495s. 

5) DRCTM# (DiReCt To Modified): must be asserted 
by the MBC during the SWEND# of the allocate, to 
make the 82495 put the line in M-state. 

6) MWB/WT#: must be asserted during the 
SWEND# of the allocate. 

7) CPLOCK# (82495 Pseudo Lock in Intel486 DX 
CPU systems): if active, the MBC must NOT do 
RFO, because 82495 will activate PALLC# only on 
the second of the 2 writes. If the MBC tried to RFO, 
it would merge only half of the data into the modified 
line. 

See [82495/490DS] for RFO information. 
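
The RFO conditions and signal sequence can be summarized in a short sketch. The function name and step strings are hypothetical paraphrases of the signal list; returning `None` stands for "RFO not possible, handle as an ordinary write", which is an assumption about the fallback.

```python
def rfo_plan(pallc_active, cplock_active):
    """Return the MBC's RFO steps for a write miss, or None if RFO must not be used."""
    if not pallc_active or cplock_active:
        # No potential allocate, or a pseudo-locked 2-transfer write.
        return None
    return [
        "write: do not echo to M-bus; activate MKEN# and CRDY#",
        "write: activate MFRZ# at the time of MEOC# and CRDY#",
        "allocate: assert INVAL on the M-bus",
        "allocate: assert DRCTM# and MWB/WT# at SWEND#",
    ]
```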

Cache-to-cache transfers (CTCT) optimize the speed of 
consistency actions in a multiprocessor. For a read linefill 
by a master causing an MHITM# from a slave, the 
writeback data movements go directly into the master 
82490 over the M-bus from the dirty 82490. For a 
write, Read-for-Ownership (RFO) is required for the 
CTCT. If RFO is not implemented, then the cache-to- 
cache option can be used only on linefill (read) misses. 
In fact, RFO makes every write-miss into a linefill. The 
82495/82490 do CTCT only on entire lines, not bytes 
or words. 



For CTCT on a linefill causing MHITM#, the MBC 
doing the writeback must initiate the writeback at the 
subline address of the initial read. Starting the write- 
back from the first word of the line is NOT acceptable. 

While CTCT is faster than re-reading the line after 
waiting for the dirty writeback, the latency will be long- 
er in most systems than for fetching lines from main 
memory. CTCT would actually waste time for such 
items as shared instruction pages. For non-written data, 
transferring from memory to a CPU is probably faster 
than transferring from another cache. So 82495 supports 
only M-line CTCT (no writeback occurs unless 
MHITM#). 

Signals involved in CTCT are DRCTM#, MZBT#, 
MHITM#, MBAOE#, and MSEL#. See 
[82495/490DS] for CTCT information. 

Snoop filtering can be implemented by the MBC using 
the 82495 SMLN# (SaMe LiNe) output to reduce the 
latency for snooping. That is, SWEND# can be asserted 
immediately to the requesting 82495, if the 82495 
asserts SMLN# to indicate the current request is to the 
same line as the previous request. In that case, other 
caches already have checked this line. SMLN# must 
be ignored if the M-bus has been used by other agents 
between the 2 82495 requests. The M-bus protocol need 
not include a "non-snooped transfer type" for the use 
of this feature, as the MBC can simply ignore the snoop 
responses from other MBC/82495 modules. 
Split transaction (ST) memory-buses such as Futurebus+ 
prove valuable in high performance systems. An 
ST (also called "connect/disconnect" or "packet 
switching") bus divides a single read request into a separate 
address-transfer phase and a data-transfer phase. 
Thus the bus is not monopolized during the long latency 
involved in accessing data across bus hierarchies. 
Writes typically are not split, as the data and address 
are available simultaneously from the writer. In a hier- 
archical bus, requests must be forwarded across bridges 
for the purposes of snooping and memory access at re- 
mote nodes, and the snoop latency may be long. Thus 
the bus should be freed between initial request and 
snoop-response for use in other transactions. 

The 82495 does not support ST directly. That would 
require snooping current cache contents and queuing up 
possible writebacks, for the accesses from other bus 
agents between the time of the BGT# (the address 
phase) and SWEND# (end of the address phase or 
later). Also 82495 cannot writeback dirty data between 
SWEND# and CRDY# (end of the data phase) of an 
ongoing cycle; it cannot suspend a transfer for later 
resumption after a snoop writeback. 



CADS#     BGT#        SWEND#      CRDY# 
  |---------|NNNNNNNNNNN|DDDDDDDDDDD| 

NN - No snooping by 82495 will occur in this area. 

DD - Delayed response by 82495 to snoop requests 
here. MTHIT# and MHITM# asserted immediately, 
but writeback of dirty data delayed until after 
CRDY# for ongoing cycle. 



82495's inability to snoop during the NN period comes 
from the need to keep 2 addresses into the tags active: 
one for the outstanding 82495 request, whose tag must 
be updated at SWEND# based on MWB/WT# and 
DRCTM#, and one for the snoop inquiry. Further- 
more, any MHITM# on the M-bus could not be easily 
linked to the request causing the snoop if 2 snoops are 
outstanding. 
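
The NN and DD regions of an ongoing 82495 cycle can be captured as a phase lookup. This is an illustrative sketch: the enum and function names are hypothetical, and the "immediate" writeback before BGT# is an assumption based on the restricted backoff protocol described earlier.

```python
from enum import Enum


class Phase(Enum):
    BEFORE_BGT = "CADS# to BGT#"
    BGT_TO_SWEND = "BGT# to SWEND#"      # the NN region
    SWEND_TO_CRDY = "SWEND# to CRDY#"    # the DD region


def snoop_response(phase, hit_modified):
    """Return (snoops_now, writeback_timing) for a snoop arriving in phase."""
    if phase is Phase.BGT_TO_SWEND:
        return (False, None)              # no snooping in the NN region
    if not hit_modified:
        return (True, None)               # hit or miss reported, no writeback
    if phase is Phase.SWEND_TO_CRDY:
        return (True, "after CRDY# of ongoing cycle")
    return (True, "immediate")            # assumption: writeback goes first
```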

To support split transactions by snooping between 
BGT# and SWEND#, a set of tags external to the 
82495 can be implemented in the MBC. Those tags 
would replicate the contents of the 82495 internal tags, 
listening to all memory bus requests and responding 
with snoop results. Only when an 82495 state change (to 
I or S) is needed will the 82495 be informed of snooping 
action — only then will the external tags relay the snoop 
request to it. 

Duplicate tags provide quicker snoop turnaround because 
no SNPCLK-to-CLK synchronization is required; 
the duplicate tags are in the SNPCLK/MCLK 
logic. While they are a high-performance option, they 
are costly and complex. 

Memory cycle abort is required in multiprocessors 
when a snooping 82495 activates MHITM# to signal 
that the memory's copy of the data requested by anoth- 
er 82495 is obsolete. As explained above, the memory 
read or write must be INHIBITED until the writeback 
is done. Depending on implementation, the original ac- 
cess may need to be retried or abandoned. If CTCT and 
RFO are implemented, then abandonment is probably 
adequate. Although the complexity of aborting could 
be avoided by delaying all memory action until 
SWEND#, that would decrease performance. An 
M-bus signal such as "SIV" (System Intervene) or 
"MBOFF#" (M-bus Back OFF) allows the MBC of 
the snooper to tell memory to abort. 

If the M-bus is pipelined, there may be constraints on 
when the MBC can assert the "abort" signal to avoid 
cancelling the access in progress for the transfer preced- 
ing the one causing MHITM#. 



Locking 

Locking of the M-bus using the 82495's KLOCK# 
output is required to ensure atomic accesses for CPU 
locks. For example, memory variables called sema- 
phores in a multitasking airline-reservation system pre- 
vent two processes from trying to update the same list 
of flight reservations simultaneously. A task would read 
the value of the semaphore in an uninterrupted read- 
modify-write (RMW) sequence, asserting the CPU's 
LOCK# signal during the RMW to block interrupts [1] 
(and block locked accesses by other processors to the 
same semaphore in a multiprocessor). If interrupts or 
other accesses were allowed during the sequence, two 
processes (or processors) might both read the sema- 
phore as "available" (zero) and both assume ownership, 
setting it to "unavailable" (nonzero). Then both might 
find the same empty seat and write their individual pas- 
senger's name in the same seat location. In the end, 101 
passengers would have tickets for a 100-seat plane 
flight. 
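
The semaphore race described above can be reproduced with a small deterministic simulation. This is a sketch, not a model of the hardware: the function name and the `interleaving` argument (the order in which tasks get their next bus access) are hypothetical, and `atomic_rmw=True` stands for a LOCK# that keeps the read and write of one task back to back.

```python
def semaphore_owners(interleaving, atomic_rmw):
    """Return the tasks that end up believing they own the semaphore."""
    semaphore = 0       # 0 = available, nonzero = unavailable
    pending = {}        # task -> value it read, while its write is outstanding
    owners = []
    for task in interleaving:
        if task in pending:
            value = pending.pop(task)       # second half of an unlocked RMW
        else:
            value = semaphore               # read the semaphore
            if not atomic_rmw:
                pending[task] = value       # another task may run before the write
                continue
        if value == 0:                      # looked available: take ownership
            semaphore = 1
            owners.append(task)
    return owners
```

Without locking, the interleaving A, B, A, B lets both tasks read zero and both claim ownership; with the atomic RMW only the first task does.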

The 82495 and i860 XP CPU implement locks in a 
sequentially consistent, or serializing, manner. That is, 
all data loads and stores within the locked sequence 
occur on the external bus in the same order as they 
appear in the program. Also, all accesses in the pro- 
gram before the LOCK instruction are completed be- 
fore the first locked read or write, and all the locked 
reads/writes complete before other accesses after the 
locked sequence. This sequentiality is required by the 
semaphore example above, to prevent the CPU from 
updating the reservation list before it has obtained own- 
ership using the semaphore. 



[1] The CPU automatically blocks interrupts during the LOCKed sequence. The bus arbiter is responsible for blocking other accesses. 
The MBC must serialize by ensuring all back-invalidates 
from 82495 to the CPU have completed before 
activating BRDY# for any locked read or write. So the 
MBC must postpone locked BRDY#s until CAHOLD 
is inactive and SNPCYC# has been inactive at least 2 
CLKs (refer to [82495/490DS] section 5.1.1). 



Bus Lock vs. Address Lock 

The 82495 echoes the CPU's LOCK# signal onto its 
KLOCK# output, and forces all CPU accesses to go to 
the M-bus, even if they are 82495 cache hits. That guar- 
antees that other processors know of the LOCK and 
the accesses. The 82495 assumes a BUS LOCK, where 
all other processors are kept off the bus during 
KLOCK# activation. Most existing "standard" buses, 
such as Multibus-II, have lock protocols which do such 
an exclusive lock. 82495 snoop behavior during asser- 
tion of its own KLOCK# is undefined, since it expects 
no other requests will be permitted then. The 82495's 
KLOCK# can remain asserted for multiple cycles 
when used with the i860 XP CPU, because the proces- 
sor allows up to 32 instructions inside a LOCKed se- 
quence. 

The 32-instruction i860 XP CPU LOCKed intervals 
may exceed 32 CLKs, as each instruction could take 
several clocks and cause a TLB miss (the intervals 
would be even longer if the i860 XP CPU did data 
cache line fills and line writebacks during LOCK#, but 
the 82495 prevents that by making KEN# = 1). Unfor- 
tunately, this limits bus concurrency. When several 
82495s share a bus or interconnection network, per- 
formance would improve if a LOCK# from one proc- 
essor did not block all others from accessing memory 
and I/O. Multiprocessors based on the Intel486 DX 
CPU are not affected as severely by LOCK#, because 
its lock endures only a few clocks — two memory ac- 
cesses at most. 

To improve performance of locks in a multiprocessor, a 
scheme of ADDRESS LOCKING may be implement- 
ed. This non-blocking protocol allows other accesses to 
the bus and memory in spite of LOCK# activation, 
and requires only that no other CPU tries to access the 
same LOCKed address. If another CPU does try to 
access the same location, that second CPU must be 
stalled until the first LOCK is de-asserted. To ensure 
that the second CPU continues to snoop accesses while 
stalled, BGT# to it for its request must be delayed 
until the lock is obtained, as signalled by the bus arbiter. 
Semaphore integrity is preserved if all CPUs follow 
the software convention of locking their RMW 
(Read-Modify-Write) semaphore accesses. Also by convention, 
the address corresponding to the first access with 
LOCK# asserted is the only locked location permitted 
to that processor, until LOCK# deasserts (refer to the 
i860 Microprocessor Family Programmer's Reference 
Manual, Intel order #240875, Section 5-14). 
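
A minimal sketch of an address-lock manager follows, to make the non-blocking protocol concrete. The class and method names are hypothetical; the text only requires that an access to a locked address by another CPU be stalled (BGT# withheld) and that each CPU hold one locked address at a time, fixed by its first locked access.

```python
class AddressLockManager:
    """Tracks which address each CPU has locked (at most one per CPU)."""

    def __init__(self):
        self._locked = {}   # address -> id of the CPU holding the lock

    def request(self, cpu, address, lock_asserted):
        """Return True if BGT# may be given now, False if the CPU must stall."""
        owner = self._locked.get(address)
        if owner is not None and owner != cpu:
            return False    # withhold BGT#; the stalled CPU keeps snooping
        if lock_asserted and cpu not in self._locked.values():
            self._locked[address] = cpu   # first locked access fixes the address
        return True

    def release(self, cpu):
        """Call when the CPU's LOCK# deasserts."""
        self._locked = {a: c for a, c in self._locked.items() if c != cpu}
```

Accesses to other addresses proceed normally, which is the concurrency gain over a bus lock.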

Would software want to be able to cache lockable loca- 
tions? Since they are used for interprocessor or inter- 
process communication, it might seem dangerous to 
keep them "hidden" in a cache. However, caching al- 
lows a CPU to read a semaphore repeatedly without 
generating bus traffic, waiting until the semaphore is 
free as indicated by a zero value. These reads can be 
done in non-locked fashion. If a copy of the semaphore 
is cached, no bus traffic is used for the reads, and the 
semaphore value still gets updated via the normal 
MESI consistency hardware when the semaphore's 
owner writes it with a new value. 
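
The spin-then-lock pattern described above can be sketched as follows. This is a single-threaded toy model, not driver code: the `Semaphore` class, its `bus_accesses` counter and the `try_acquire` helper are assumptions made for illustration, and only the locked exchange is counted as bus traffic.

```python
class Semaphore:
    """Toy shared word; counts the accesses that would reach the bus."""

    def __init__(self):
        self.value = 0          # zero means free, as in the text
        self.bus_accesses = 0

    def cached_read(self):
        return self.value       # non-locked read served from the cache

    def locked_exchange(self, new):
        self.bus_accesses += 1  # locked read-modify-write goes to the bus
        old, self.value = self.value, new
        return old


def try_acquire(sem, max_spins=1000):
    """Spin on cached reads; attempt the locked exchange only when free."""
    for _ in range(max_spins):
        if sem.cached_read() != 0:
            continue            # still held: keep spinning in the cache
        if sem.locked_exchange(1) == 0:
            return True         # the locked access saw "free": we own it
    return False


sem = Semaphore()
print(try_acquire(sem), sem.bus_accesses)
```

While the semaphore is held, a second caller spins without adding to `bus_accesses`, which mirrors the argument for making semaphores cacheable.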

KLOCK# de-assertion for back-to-back Intel486 DX 
CPU locked accesses is required of the MBC if it uses 
address-based locking, so that the lock-manager knows 
the correct address. The i860 XP CPU always deactivates
LOCK# for at least one clock between separate
locked regions, by virtue of its deactivation in the clock 
after the last locked ADS#. However, the Intel486 DX 
CPU deactivates LOCK# only in the clock after the 
last BRDY# of the last locked access. Thus LOCK# 
and KLOCK# may not deactivate when two XCHG 
instructions occur in succession. The MBC can insert a 
deactivation of the M-bus MLOCK# signal by know- 
ing all Intel486 DX CPU locked accesses are Read- 
Modify-Write sequences. The MBC should deassert 
MLOCK# regardless of KLOCK#'s value, after the 
write. 

Deassertion of KLOCK# by the MBC hardware may 
be required in any Intel486 DX CPU system, to avoid 
bus timeout and starvation of other bus masters when a 
continuous stream of locked accesses occurs in one 
processor's program. Without it, one processor could 
monopolize the bus and prevent re-arbitration. 



CPLOCK# 

CPLOCK# has a purpose similar to KLOCK# in 
Intel486 DX CPU systems, but is unused in i860 XP 
CPU systems. PLOCK# (Pseudo-LOCK) indicates an
atomic 8-byte 2-transfer write for floating-point data 
which should not be interrupted. The 4-byte bus of the 
Intel486 DX CPU requires 2 transfers for an 8-byte 
datum, and if only half the transfer gets done before 
another bus master reads memory, half-wrong data 
could be read.
Thus the MBC should not relinquish the bus nor require
snoops of its 82495 from the time of the BGT#
for the first write (when CPLOCK# was asserted by 
82495) through the BGT# of the second write. This 
increases the worst-case delay of writeback for a 82495- 
snoop-hit to a modified line; to avoid the delay, the 
MBC can tie the CPLOCK#[PLOCKEN] pin low to 
disable PLOCK functionality. 
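
The half-written hazard that PLOCK# guards against can be reproduced in a few lines. This is a sketch under stated assumptions: memory is modelled as an 8-byte little-endian buffer, the function name is hypothetical, and the "other bus master" is simply a read placed between the two 4-byte transfers.

```python
import struct


def write_double_in_two_transfers(memory, value, read_between=False):
    """Write an 8-byte double as two 4-byte transfers.

    If read_between is True, another master reads the full 8 bytes
    after the first transfer; that value is returned.
    """
    raw = struct.pack("<d", value)
    memory[0:4] = raw[0:4]              # first 4-byte transfer
    seen = None
    if read_between:
        seen = struct.unpack("<d", bytes(memory))[0]
    memory[4:8] = raw[4:8]              # second 4-byte transfer
    return seen


memory = bytearray(struct.pack("<d", 1.0))
torn = write_double_in_two_transfers(memory, 0.1, read_between=True)
final = struct.unpack("<d", bytes(memory))[0]
print(torn, final)
```

The intermediate read returns a value that is neither the old datum nor the new one, which is why the two writes must not be separated by another master's access.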



9.0 MORE ALTERNATIVES 

In addition to the options discussed above, several oth- 
er choices affect Memory Bus Controller design. 

M-bus clocking should be chosen to allow future ver- 
sions of 82495 and 82490 at higher clock speeds. Up- 
grading the CPU module performance by replacing the 
processor and 82495/82490 will be possible. While 
some redesign of the CPU-side MBC state machines 
may be needed for faster clocks, the memory bus can 
remain the same. Thus an asynchronous interface with 
either a strobed unclocked M-bus or a clocked M-bus at 
less than 50 MHz is advised. A fully synchronous 
M-bus/CPU MBC would be difficult to move to higher 
clock speed. 

One convenient way to design the MBC is with the 
M-bus MCLK = 0.5*CLK. Probably it will be possi- 
ble to keep the M-bus at half the CPU CLK rate, even 
with faster CPUs. The big advantage of this half-speed 
link is that no synchronizers are needed within the 
MBC if the MCLK and CLK edges are skew-con- 
trolled. The MBC can be totally on CLK, as in the 
design example of Appendix B. 



The choice between a Strobed or Clocked M-bus is of- 
ten determined by existing bus protocols in which 
82495/82490 will be used. Most existing buses are 
clocked; however, Futurebus+ requires all bus entities
to use strobed transfers, but allows an optional clocked
mode for high-speed packet transfers [Fbus90]. The 
tradeoffs are shown in Table 2. 

Line size and M-bus width also determine upgradabil- 
ity to possible future versions of 82490 on the same 
M-bus, with more than 32kB per chip. If a higher-den- 
sity 82490 becomes available, the fact that 82495 has 8k 
tags requires: 

128 data bytes per tag (128 byte line, or sectored 
64-byte lines) 
AND 
8-byte or 16-byte memory bus width 

to allow a 1 MByte or 2 MByte 82490 configuration. If 
a smaller bus is used, a larger 82490 is possible, but 
the bus-size multiplexing described earlier would be 
needed. 

Writeback (WB) cache policy is advised for high-per- 
formance (multi)processors to limit bus traffic. Howev- 
er, a writethru (WT) design is simpler for the MBC 
because there never is a need to backoff the 82495 due 
to MHITM#. In fact, the snoop window in a WT sys- 
tem becomes unnecessary and SWEND# can be acti- 
vated simultaneous with KWEND#. In such a system, 
the only states of cache lines are S or I. Snooping has 
no effect during reads and only causes invalidations (in 
the slaves) for writes in a WT design. Cache-to-cache 
transfers and RFO are irrelevant. 



Table 2. Clocked vs. Strobed MBUS Tradeoffs 



CLOCKED MBUS Advantages 

Design techniques for clocked systems are well 
known. 

Fast arbitration using MCLK state machines. 
Burst transfers proceed at one datum per MCLK 

CLOCKED MBUS Disadvantages 

Must round up delays to MCLK period quanta; e.g., a
33 ns delay means two 30 ns MCLKs are needed.

Some 82495-to-82495 signals must be twice 
synchronized: once at sender, once at receiver. 

Backplane length limited. 

MCLK skew must be controlled. 

Requires assumptions on CLK vs. MCLK speed
ratio: for example, CLK > MCLK > CLK/2.



STROBED MBUS Disadvantages 

MBC design may require delay lines and non- 
conventional design techniques. 

Arbitration slow because signal must be 
synchronized at arbiter and at modules. 

Burst throughput slowed if each transfer requires 
acknowledgement from receiver. 

STROBED MBUS Advantages 

Delays determined by device speed and physics, 
not by MCLK quanta. 

Each signal goes through a synchronizer once, only at
the receiver, so less time is lost at synchronizers.

Fewer limits on backplane length or capacitance 
or number of boards. 

No clock skew worries. 

Any CLK frequency will work. 
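
The rounding penalty listed among the clocked M-bus disadvantages can be worked through numerically. The helper below is a hypothetical illustration; it only encodes the rule that a delay on a clocked bus occupies a whole number of MCLK periods.

```python
import math


def clocked_delay(delay_ns, mclk_period_ns):
    """Return (MCLK count, effective delay in ns) on a clocked bus."""
    clocks = math.ceil(delay_ns / mclk_period_ns)
    return clocks, clocks * mclk_period_ns


# The table's example: a 33 ns delay on a bus with a 30 ns MCLK.
print(clocked_delay(33, 30))
```

On a strobed bus the same operation would cost roughly the 33 ns itself, since the delay is set by the devices rather than by clock quanta.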



AP-452



10.0 MBC DIFFERENCES FOR i860 
XP CPU VERSUS Intel486 DX 
CPU 

The same MBC design can be used for either i860 XP 
CPU or Intel486 DX CPU if the MBC supersets the 
requirements of the two. A "CPU TYPE" configura- 
tion pin can be included in the MBC to modify its be- 
havior. First, make the features as common as possible: 

• Choose a configuration acceptable for both CPUs:

a) 256 kBytes, 4 transfers/line, 64-bit M-bus, 32-byte
line.

b) 512 kBytes, 4 transfers/line, 128-bit M-bus,
64-byte line.

c) 256 kBytes, 8 transfers/line, 64-bit M-bus, 64-byte
line.

d) 512 kBytes, 8 transfers/line, 128-bit M-bus,
128-byte line.

• i860 XP CPU pfld data is cached in 82490; no
optimizations are included for pfld.

• Assume that LOCK# duration does not matter (i.e.,
that back-to-back LOCK#ed requests from
Intel486 DX CPUs and long LOCK# cycles in i860
XP CPU do not cause bus ownership timeout). 
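
The four common configurations listed above are internally consistent: in each, the line size equals the number of transfers per line times the M-bus width in bytes. The short check below illustrates this; the dictionary layout and function name are illustrative assumptions, while the numbers are taken from the list.

```python
CONFIGS = {
    "a": {"cache_kbytes": 256, "transfers": 4, "mbus_bits": 64, "line_bytes": 32},
    "b": {"cache_kbytes": 512, "transfers": 4, "mbus_bits": 128, "line_bytes": 64},
    "c": {"cache_kbytes": 256, "transfers": 8, "mbus_bits": 64, "line_bytes": 64},
    "d": {"cache_kbytes": 512, "transfers": 8, "mbus_bits": 128, "line_bytes": 128},
}


def line_bytes(transfers, mbus_bits):
    """Bytes moved by one line fill: transfers times bus width in bytes."""
    return transfers * mbus_bits // 8


for name, cfg in CONFIGS.items():
    computed = line_bytes(cfg["transfers"], cfg["mbus_bits"])
    lines = cfg["cache_kbytes"] * 1024 // cfg["line_bytes"]
    print(name, computed, computed == cfg["line_bytes"], lines)
```

The last column is the number of cache lines each configuration holds, which is a convenient cross-check against the tag count of the cache controller.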

Features Strictly for the Intel486 DX CPU:

• BE7-4# for M-bus must be synthesized by the
MBC from A2 and BE3-0#.

• CPLOCK# protection.

• WRMRST (warm reset) can be included for both
CPUs, but is optional. 
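
One plausible way to synthesize the eight M-bus byte enables from A2 and BE3-0# is sketched below. It assumes that A2 selects which 32-bit half of a 64-bit M-bus the Intel486 DX CPU access falls in, and that all enables are active low; the function name and bit packing are hypothetical.

```python
def synthesize_mbus_byte_enables(a2, be3_0_n):
    """Return the 8-bit active-low BE7-0# pattern for a 64-bit M-bus.

    a2      -- address bit A2 (0 or 1)
    be3_0_n -- 4-bit active-low BE3-0# value (bit i low = byte i enabled)
    """
    inactive = 0xF                      # four enables held high (inactive)
    pattern = be3_0_n & 0xF
    if a2:
        # Upper 32-bit half selected: BE7-4# carry the CPU pattern.
        return (pattern << 4) | inactive
    # Lower 32-bit half selected: BE7-4# stay inactive.
    return (inactive << 4) | pattern


print(format(synthesize_mbus_byte_enables(0, 0b0000), "08b"))
print(format(synthesize_mbus_byte_enables(1, 0b1110), "08b"))
```

In hardware this amounts to a handful of gates per enable, so it fits easily in the same PAL as the rest of the CPU-side decode.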

Features Strictly for the i860 XP CPU: 

• Burst writes from the CPU (Length = 2 and
Length = 4).

• A second 74F377 BE# latch is needed, for i860 XP
CPU pins BE7#-BE4#, LEN, and CACHE#.
PCYC and CTYP can also be latched for debug
purposes.

• PCHK# output from i860 XP CPU must be ignored
except during the CLK after BRDY# comes
from the MBC. PCHK# from Intel486 DX CPU is
always valid.

Differences between the MBCs:

• Configuration pin strapping of 82495 inputs.

• Decoding CPU request burst length from
CLEN1:0 (82495 pins in Intel486 DX CPU systems)
or LEN and CACHE# (i860 XP CPU).

• CPU line length: 16 bytes (Intel486 DX CPU) vs.
32 bytes (i860 XP CPU) means that the Intel486 DX
CPU MBC will give 2 BRDY#s for every 1 BRDY#
of the i860 XP CPU MBC.

Differences between Intel486 DX CPU and i860 XP
CPUs which have no impact on MBC:

• Intel486 DX CPU FLUSH# input pin.

• i860 XP CPU writeback caching, HITM#, and
BOFF#.

• i860 XP CPU CS8 vs. Intel486 DX CPU BS8#,
BS16# (none are really usable).

• Intel486 DX CPU RDY# pin and interruptible
bursts (not usable with 82495).

• i860 XP CPU acknowledges HOLD during
LOCK#.

• EADS# duty cycle (50% maximum for i860 XP
CPU and 100% for Intel486 DX CPU, but handled
by 82495).

• KEN# pin sampling interval by the CPU.

• Behavior of CPU in response to BOFF# assertion.

• i860 XP CPU BERR (Bus ERRor) pin versus
Intel486 DX CPU NMI (Non Maskable Interrupt).



11.0 SUMMARY 

The interface between a CPU/82495/82490 chip set 
and a system memory bus allows much flexibility and a 
wide range of performance options. The simplest MBC 
can be a few PALs, while a top-performance multipro- 
cessing version may take thousands of gates on an 
ASIC. Signal pin counts for the MBC can range from 
70 to 120, varying with the memory bus definition im- 
plemented by the MBC. 

While beyond the scope of this document, topics for 
consideration include detailed timing diagrams, critical 
path analysis, simulation of bus traffic, and hit rates. 
Useful also are simulations of performance impact of 
the number of CPUs, WB versus WT policy, memory 
latency, CTCT, RFO, and duplicate tags. Also at issue 
are interrupt controller hardware, PAX concurrency 
control, boundary scan and selftest, PC-compatibility- 
implications, i860 XP CPU pfld options, and high- 
speed design issues of impedance, termination, and 
noise.

Q: Why activate BGT# early, since 82495 won't
snoop between BGT# and SWEND#? 

ANS: CNA# for MBC pipelining is ignored until
BGT#. Also, BGT# must precede CRDY# by
at least 3 CLKs. And BGT# must precede 
BRDY#. 

Q: How does PAX multiprocessing work with 
82495 and an MBC? 

ANS: A CCU chip must be included on the M-bus side 
of 82495 and 82490 for each i860 XP CPU in a 
PAX multiprocessor. Refer to [MPIC90]. 

Q: Can the i860 XR CPU use a 82495/82490 
cache? 

ANS: No, the bus protocol of 82495 and 82490 
matches Intel486 DX CPU and i860 XP CPUs, 
but not i860 XR CPU. 

Q: Can 2 CPUs plug into one 82495, getting effi- 
ciency from shared cache? 

ANS: No, the protocol and physical capacitance of the 
interface do not allow it. 

Q: Should the same MBC be used for Uni & Multi? 
(i.e., how much extra logic is added to make a 
multiprocessor MBC?) 

ANS: It is possible, and the extra logic is reasonable 
for a Uni which could be upgraded to multi by 
adding another CPU + cache module. 

Q: Are software models of 82495/82490 available 
for simulation of MBCs? What simulators are 
supported? 

ANS: As of September 1990, beta versions of models 
will be available Q4 1990 from Silicon West, Inc. 
Phone = (213)597-5995, FAX = (213)494- 
4588. Contact Silicon West for information on 
simulators supported (currently Workview, Ver- 
ilog, Zycad VHDL, Mentor Graphics). 

Q: What is the fastest possible transfer of data from 
Mdata to Cdata? (i.e., how many CPU CLKs are
spent?) 

ANS: The initial timings are listed in [82495/490DS]. 
They are about 1.5 CLK periods including set- 
up-time at the CPU data pins. The connection 
from CDATA to MDATA is essentially a flow- 
through path. 



Q: Can the CPU-bus and Memory-Bus be on the 
same 50 MHz clock? 

ANS: Yes, but multiprocessor memory buses probably 
have too much capacitance and trace length to 
tolerate a 50 MHz clock. 

Q: What are pin-counts for an MBC (i.e., will it fit 
in my ASIC)? 

ANS: 70 to 120 signal pins, depending on the bus pro- 
tocol and MBC features. 

Q: How long is a reasonable cacheability window, 
in MCLKs? 

ANS: KWEND# is activated when MKEN# and 
MRO# are stable. MKEN# and MRO# can 
come from address decoders in the MBC or on 
the MBUS. Thus KWEND# could be 2 CLKs 
after CADS# if the MBC itself determines 
cacheability, or as much as 5 MCLKs if the M- 
bus must see the request and determine 
MKEN#. 

Q: How long is a reasonable snooping window, in 
CLKs? 

ANS: MWB/WT# and DRCTM# are generated 
from the snoopers' MTHIT# and MHITM# 
signals. Thus SWEND# is activated when those 
signals (MWB/WT#, DRCTM#) are stable. 
That would be at least 7 CLKs, not counting the 
possible delay between CADS# and its M-bus 
counterpart MADS#. (see the discussion of 
snoop window above). 

Q: Is the SWEND# window length deterministic,
or must SNPBSY# determine it? 

ANS: It is deterministic, but may be long when the 
82495 is busy. Yes, the SNPCYC# signal is re- 
quired to determine SWEND#. If SNPCYC# 
is not used, then the worst-case 82495 delay 
must be imbedded into the MBC logic, making 
the window longer than necessary most of the 
time.

Q: How long can 82495 be "busy", activating
SNPBSY# and ignoring subsequent SNPSTB# 
activations? 

ANS: 82495 busy-ness is not due to CPU requests, be- 
cause 82495 gives higher priority to the snoops. 
But for snoops to M-state 82495 lines, 82495 
must do inquiries to the i860 XP CPU and get 
the more-recently modified data from i860 XP 
CPU before 82495 can writeback. A 82495 con- 
nected to an Intel486 DX CPU does not need to 
get modified data, as the Intel486 DX CPU has 
only S-state lines in the CPU cache. However, if 
SNPINV was active, 82495 must back-invali- 
date either CPU for S, E, or M state lines. The 
82495 must do multiple inquiries or invalidates
when the line ratio is 2 or 4. 

Q: What is the synchronization penalty in snooping 
(i.e., how long from M-bus request to MHITM#
validity)? 

ANS: About 3 CLKs. See the discussion of "snoop 
window" above. 

Q: What is optimal 82495 cache-line length 
(32,64,128)? 

ANS: This is TBD from simulations or measurements. 
It depends on the behavior of SW applications 
the HW is intended for. 

Q: Can Futurebus+ be used as the M-bus for a 
82495/82490 system? 

ANS: Yes. The Futurebus+ spec is compatible with 
the 82495/82490. It supports MESI, strobed 
data transfer, address pipelining, cache to cache 
transfers, Read For Ownership, and many other 
features. 82490 would be used in strobed mode 
for Futurebus+.

Q: Can 82495 do a split-transaction bus (if not, why 
not?)? 

ANS: Maybe. 82495 implements a restricted-backoff 
protocol to eliminate potential deadlock condi- 
tions in a shared bus multiprocessor environ- 
ment. Because of that protocol, and the fact that 
82495 will not snoop between BGT# and 
SWEND#, it is difficult to implement split 
transactions. It may be possible, using an addi- 
tional set of tags which replicate 82495's and 
allow snoops to continue between BGT# and 
SWEND#. 

Q: Can another 82495 be used for the "duplicate 
tags" for split transaction snooping? 

ANS: No, the 82495 signal definitions and protocols 
make that very difficult. 

Q: Why do the KWEND# and SWEND# signals 
exist? 

ANS: SWEND#, by gating 82490-to-CPU data transfer,
allows the M-bus data transfer simultaneous
with snooping. In the usual case, no modified
copy will be found by the snoopers, so that
transfer was not wasted. The alternative (that
data cannot be transferred from memory until
snoops complete) costs performance or requires
a central tag directory. SWEND# triggers the
82495 to update its tags.

KWEND# allows a variety of cacheability
determination schemes: a long delay to determine
MKEN# and MRO# might be needed if a
programmable RAM or EEPROM decodes
cacheability based on address. If not, KWEND# can
be activated quickly if there is a local MBC
decode of A31:A28 to determine MKEN#, for
example.

Q: Why not just one WEND signal? 

ANS: Performance. KWEND# can be determined
quicker than line-status in most implementa- 
tions. The early knowledge of cacheability to the 
82495 allows it to begin line replacements and 
allocations, and activate the next CADS# to 
MBC. 

Q: How to connect 8-bit (or 16-bit) devices such as 
ROM and serial ports to 82490? 

ANS: If the devices are made non-cacheable, they can 
be tied to the MDATA pins of the least-signifi- 
cant 82490s. However, if fetches from them 
must be cacheable, then byte assembly logic 
(latching transceivers) must exist to allow 82490 
to transfer from them 4 or 8 bytes at a time 
(1 M-bus width per transfer). 82495 and 82490 
require all cacheable locations to do burst trans- 
fers an M-bus-width of data per transfer. 

Q: Does the 82495 have a CS8 mode? Does 82495 
support i860 XP CPU in CS8 mode? 

ANS: To support i860 XP CPU CS8 mode with 82495, 
the 8-bit ROM must be marked non-cacheable. 
This means that code being fetched in CS8 mode 
won't be cacheable in the 82495 or the i860 XP 
CPU. For an 8-byte M-bus, the ROM data pins 
must be wired to the M-bus (MDATA of 82490) 
bits 7:0. For a 16-byte M-bus, the ROM must 
attach to M-bus bits 7:0 AND bits 71:64, which 
would require an 8-bit transceiver at the ROM.

Q: Should the DRAM controller be part of the 
MBC? 

ANS: For a simple uniprocessor, perhaps. Multipro- 
cessors would have a DRAM controller for 
(each bank of) main memory, separate from the 
MBCs. 

Q: How can the system implement retry upon an 
M-bus parity error? 

ANS: The MBC must re-issue the initial request, and 
reset the 82490 transfer logic using the MSEL# 
signal.

Q: Can 82490 use an ECC-corrected bus?

ANS: ECC (Error Correcting Code) can be used on 
the main memory bus, but the ECC check bits 
must be converted to parity or discarded before 
feeding the 82490. ECC would have to be gener- 
ated at the 82490 MDATA pins for writes to 
memory. 

Q: Can the MBC implement cache-to-cache trans- 
fer on a write? 

ANS: No, the 82490 cannot "snarf" write data. That
is, it does not merge a write (partial line) from 
the M-bus with existing cached lines. It can do 
Read-For-Ownership, merging write-miss data 
with an incoming line writeback from another 
cache. 

Q: Can semaphores be cached in 82495/82490? 

ANS: Yes, but all read/writes which are locked are 
forced onto M-bus. So the semaphore would be 
read repeatedly without locking, until it is 
"free". Then SW would re-read it in locked fash- 
ion to obtain ownership. 

Q: Is there any advantage to making semaphores 
cacheable, if all locked accesses go to M-bus? 

ANS: Yes, SW can repeatedly read the semaphore 
without LOCKing it, and no bus traffic thus is 
generated, waiting for the release of the sema- 
phore by any other master. 

Q: Can a single multiplexed address + data bus (like 
Multibus-II) be used for M-bus? 

ANS: Yes, but transceivers external to the 82495 and 
82490 are required. 

Q: How does the MBC implement a "BACKOFF" 
when another 82495 activates MHITM#? 

ANS: If the data requested from a master 82495 is 
Modified in a snooper 82495, the master BC 
must postpone CRDY# until the modified line 
is deposited in the master 82490, after the 
snooper flushes the modified line to M-bus. 

Q: Can MBC duplicate the CPU cache tags, to 
avoid unnecessary inquire cycles? 

ANS: Yes, but the performance benefit may not war- 
rant the extra hardware. 



Q: Can i860 XP CPU Late-Backoff mode be used 
with 82495? 

ANS: No. 

Q: What are the advantages and disadvantages of 
doing an asynchronous system (where MCLK is 
not the same as CLK)? 

ANS: Designers can easily upgrade the CPU side to 
higher frequencies (above 50 MHz) by faster 
PLDs in the CPU side of the MBC. The M-bus 
interface and all modules on the M-bus will not 
need to be changed. It is easier to design a board
when most parts run at a lower frequency. 

Q: If the 82490 is reading information from the 
memory bus and the MBC is generating 
BRDY#'s (RDYSRC= 1), can the MBC abort 
the cycle by giving a premature CRDY#, and 
restart it? 

ANS: The MBC can abort a memory bus cycle but 
cannot abort a CPUbus cycle. Once the first 
BRDY# is generated the cycle must complete. 
On the memory bus, a cycle is not aborted by 
giving an early CRDY#. In fact the 82495 does 
not understand that a cycle has been aborted. 
Only the MBC and 82490 are involved. The 
82490 allows its buffer to be reset using the 
MSEL# signal. 

Q: What is the purpose of 82490 having a separate 
MOCLK for output data, in addition to the 
MCLK for input signals? 

ANS: MOCLK allows greater hold time for writes 
from 82495, if it is skewed slightly from the 
MCLK which M-bus receivers use. MOCLK 
and MCLK must be exactly the same frequency. 
If the skew is not needed, MOCLK can be tied 
low. 

Q: How many levels of pipelining can the 82495 use 
on the external memory bus? 

ANS: Each 82495 can use one level of pipeline on the 
memory bus, so the bus pipe depth can be great- 
er in a multiprocessor. A uniprocessor allows 
just one level of M-bus pipeline.

APPENDIX B:
Intel486 DX CPU Uniprocessor MBC Design 



Please refer to Application Note AP-458, Designing a 
Memory Bus Controller for a 50 MHz Intel486 DX
Microprocessor Based System (Intel order #241166).

APPENDIX C:
CPU DUAL-PROCES 



OVERVIEW 

This section presents a design for a memory bus con- 
troller for a system containing two i860 XP processors, 
each with an 82495XP/82490XP secondary cache. This 
MBC, together with an i860 XP CPU, 82495XP, and 
82490XP, comprises a core which interacts with a 
memory bus utilizing a bus protocol similar to that of 
the i860 XP CPU. 

The design presented here features an i860 XP CPU 
and 256 KB of 82495/82490 cache running at 50 MHz 
in each core. The clocked 64-bit (+8 parity) memory
bus is asynchronous to the CPU and cache clock, al- 
lowing memory to run at lower speeds for more eco- 
nomical and convenient memory design. The MBC fea- 
tures snooping and pipelining to the memory, as well as 
advanced 82495 processes like write allocation, read for 
ownership and cache-to-cache transfers. 



ASSUMPTIONS 

The implementation presented here is a two processor 
design which can be extended to more than two CPUs. 
The definitions and examples given in this appendix are 
specific to the two processor version. The section 
Extension to 3 or More Processors gives specifics for 
larger systems based on this design. 

The memory bus is 64 bits data plus 8 bits parity. 

The MBC design allows the processor to run at a high- 
er clock frequency than the memory bus. The frequen- 
cies are constrained such that the ratio of the frequency 
of the processor CLK and the frequency of the memory 
bus MCLK is between 1 and 2.

Figure C-1. Pinout Environment of MBC

This constraint ensures proper synchronization of signals
which cross between the MCLK portion of the
MBC and the CLK portion. The prototype was de- 
signed and simulated with a CPU speed of 50 MHz and 
a memory bus speed of 33 MHz. 
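
The frequency constraint can be checked for a candidate pair of clocks as follows. This is a sketch only: the function name is hypothetical, and treating both bounds as inclusive is an assumption, since the text says only that the ratio lies between 1 and 2.

```python
def clock_ratio_ok(clk_mhz, mclk_mhz):
    """True if CLK/MCLK lies between 1 and 2 (bounds assumed inclusive)."""
    ratio = clk_mhz / mclk_mhz
    return 1 <= ratio <= 2


print(clock_ratio_ok(50, 33))   # the prototype's clocks, ratio about 1.5
print(clock_ratio_ok(50, 20))   # ratio 2.5, outside the constraint
```

The prototype's 50 MHz CPU clock and 33 MHz memory bus sit near the middle of the permitted range.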

Snooping mode can be independently set to strobed or
clocked in each core.

The main memory is responsible for returning the 
MKEN# attribute to the memory bus controller in the 
MCLK following MADS# assertion.

To save synchronization clocks, the MBRDY# signal 
of the protocol is defined to be asserted one MCLK 
before data is actually available. 

The 82495 operates with 32 bytes/line, 1 line/sector, 
and requires 4 memory bus transfers per line fill. 



OPTIONS 

With modifications the 82495 can operate in a mode 
with 64 bytes/line, 1 line/sector, requiring 8 memory 
bus transfers per line fill. 

The design here utilizes the 82490's clocked memory 
bus mode. The strobed mode can also be utilized by 
making modification to the design. 

Support for various 82495 PFLD modes can be added 
to the design. 

Operation with either write-through or write-once pro- 
tocol can be performed. 



MEMORY BUS PROTOCOL 



M-bus Signals 

The system M-bus resembles the i860 XP CPU bus. It 
allows CPU modules with or without external cache on 
the same M-bus, so that balance between high perform- 
ance and low cost can be achieved. The signal specifica- 
tions below indicate Input (I), Output (O), or bidirec- 
tional (I/O) from the MBC's point of view. Output 
signals to the memory bus such as MADS#, MLEN, 
and MA31:MA3 are floated by all MBCs except the 
one currently owning the M-bus. 

Signals whose names begin with Y (as in YBGT#) are 
in the MCLK side of the MBC, while an X prefixed 
name is in the CPU CLK side of the MBC. The X and 
Y signals are internal to the MBC. 



MRESET (I) - Memory bus RESET 

This signal forces the CPU to begin execution in a 
known state. It resets all MBC machines which are 
driven by MCLK. It is also synchronized (via a 2-stage 
synchronizer) to CLK and fed to the RESET inputs of 
the CPU, 82495, 82490s and all MBC machines which 
are driven by CLK. 
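
The 2-stage synchronizer mentioned above can be modelled behaviourally as two flip-flops in series on the destination clock. The class below is an illustrative sketch with hypothetical names; it shows only the two-edge latency and does not model metastability.

```python
class TwoStageSynchronizer:
    """Two flip-flops in series, both clocked by the destination clock."""

    def __init__(self, initial=0):
        self.stage1 = initial
        self.stage2 = initial

    def clock(self, async_in):
        """Advance one destination-clock edge; return the synchronized output."""
        self.stage2 = self.stage1   # second flop takes the first flop's old value
        self.stage1 = async_in      # first flop samples the asynchronous input
        return self.stage2


sync = TwoStageSynchronizer()
print([sync.clock(level) for level in (1, 1, 0, 0)])
```

A change on the input appears at the output on the second destination-clock edge after it is first sampled, which is the synchronization cost that the protocol's early MBRDY# definition is meant to offset.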



MADS# (I/O) - Memory bus ADdress Strobe 

This signal indicates that a new valid bus cycle is cur- 
rently being driven. The cycle address (A31:A3) and 
cycle specifications are valid in the MCLK that 
MADS# is asserted. A pipelined MADS# will be is- 
sued only after the MBC knows that the current cycle 
is guaranteed not to be aborted. For most memory ac- 
cesses, the master will assert MSNPSTB# to snoop 
other caches on the bus. When MSNPSTB# has been
asserted, MNA# will cause a new MADS# to be issued
after MSWENDI# signifies snooping has completed.
Furthermore, if MHITMI# was asserted with
MSWENDI# in this case, the new MADS# cannot be
issued until after the current cycle (now a snoop
write-back) has been completed. When MHITMI# is not
asserted with MSWENDI#, MADS# can be asserted
immediately following MSWENDI#. If MSNPSTB#
was not asserted for the current cycle, then MADS#
could be issued immediately after MNA#, without
waiting for MSWENDI#.
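
The conditions for issuing a pipelined MADS# can be summarized as a decision function. This is a sketch, not the MBC's state machine: each argument is a boolean meaning "asserted" (or "seen") for the current cycle, and the function and parameter names are hypothetical.

```python
def may_issue_pipelined_mads(mna, snpstb_asserted, mswendi_seen,
                             mhitmi, current_cycle_done):
    """Decide whether a pending cycle may drive a new MADS# now."""
    if not mna:
        return False                # memory has not requested the next address
    if not snpstb_asserted:
        return True                 # no snoop outstanding: issue right after MNA#
    if not mswendi_seen:
        return False                # snoop window has not ended yet
    if mhitmi:
        return current_cycle_done   # wait for the snoop write-back to finish
    return True                     # clean snoop: issue right after MSWENDI#


print(may_issue_pipelined_mads(True, True, True, False, False))
print(may_issue_pipelined_mads(True, True, True, True, False))
```

The ordering of the tests reflects the text: the snoop result gates the new address only when MSNPSTB# was actually asserted for the current cycle.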

For read cycles MADS# is issued after CADS#, re- 
gardless of CDTS# state. Requesting the memory bus, 
via MBREQ, is also done immediately after CADS#. 
This is due to the fact that CDTS# in a read cycle does
not affect the memory bus, but indicates when the first 
BRDY# can be issued to the CPU. 

For memory writes MADS# is issued only after 
CDTS#. Requesting the memory bus, via MBREQ, is 
also done after CDTS#. This guarantees that for write
cycles the memory bus data is valid 1 MCLK after 
MADS# (similar to the CPU). 



MNA# (I) - Memory bus Next Address 
Acknowledgement 

This is the memory bus next address signal, driven by 
the memory controller. It indicates to the MBC that 
the memory bus is ready to accept a new bus cycle, 
although the previous one has not been completed yet. 
If the MBC has a new cycle pending and the current 
cycle is guaranteed not to be aborted (see MADS# 
above), then a new MADS# will be issued. Note that
the maximum level of pipelining on the memory bus is 
1.

MBRDY# (I/O) - Memory bus Burst ReaDY#

This is the burst ready signal. For read cycles, 
MBRDY# indicates that in the following MCLK the 
memory bus will present valid data on the 82490 
MDATA pins. For writes, MBRDY# indicates that in 
the following MCLK the memory bus will accept the 
data from the 82490 MDATA pins. Note that this sig- 
nal is active 1 MCLK before the data is available on the 
memory data bus. This reduces the synchronization 
penalty between the M-bus and CPUbus by 1 MCLK 
period. 

For a clocked-asynchronous MBC, MBRDY# is de- 
layed by the MBC 1 MCLK and passed to the 82490 
MBRDY# pin. For a strobed-asynchronous MBC, the 
82490 MISTB and MOSTB will change value in re- 
sponse to MBRDY#. 

For Cache to Cache Transfers, the MBC with the Mod- 
ified line drives MBRDY# active once per MCLK 
without wait states for the duration of the line burst. 



MSNPSTB# (I/O) - Memory bus SNPSTB# 

This is the memory bus snoop strobe signal. It is assert- 
ed 1 MCLK after MADS# by the MBC which asserted 
MADS#, for all cycles that could be M-state in the 
other MBC. In writebacks and I/O cycles, 
MSNPSTB# is not asserted. The MSNPSTB# output 
of each MBC is connected to the 82495 SNPSTB# in- 
put of the other MBC, in this two processor design. 



MSWENDO# (O) - Memory bus SWEND# Output



This is the memory bus snoop window end indication 
which is driven by the snooping MBC. It is connected 
to the master MBC's MSWENDI# input, indicating that
snooping is finished and the snoop attributes are valid. 

MSWENDO# is an asynchronous signal which is 
triggered by the 82495 SNPCYC# falling edge, and 
is negated after sampling an active SNPSTB#.
MSWENDO# of one MBC is connected directly to the 
MSWENDI# input of the other MBC. 

MSWENDI# (I) - Memory bus SWEND# Input 

MSWENDI# is connected directly to the other core's 
MSWENDO# output. It is internally sent to two syn- 
chronizers: synchronized to CLK to generate 82495 
SWEND#, and synchronized to MCLK for MBC state 
machines which determine whether the current bus cy- 
cle should be aborted. 



MSWENDI# indicates the end of the snoop window 
and that the snoop results MHITMO# and MTHIT# 
are valid. An active MHITMI# indicates a snoop hit 
to a modified line, and causes the master MBC to dis- 
card any data which has arrived from main memory, so 
that new data, which is being written out as the snoop- 
ing core performs a snoop write back, can be accepted. 
MTHIT# of each core is connected to the 
MWB/WT# input of the other core, to generate the 
WB/WT# signal to the 82495. 

MHITMO# (O) - Memory bus HITM# Output

This indicates a snoop hit to a modified line. In the two 
processor implementation of this MBC, it is connected 
directly to the other MBC's MHITMI# input. 

MHITMI# (I) - Memory bus HITM# Input 

MHITMI# is connected to the MHITMO# output of 
the other MBC, and determines if MBOFF# and 
MABORT# will be asserted. It is sampled on
MSWENDI# activation.

MTHIT# (O) - Memory bus Snoop Hit Indication 

This snoop hit indication is based on the 82495 
MTHIT# output. The MTHIT# output of the snooping
core is used by the master core to determine the
WB/WT# state for the accessed line. The 82495 
MTHIT# signal is passed directly onto the memory 
bus when the SNPINV signal is inactive for the snoop. 
On snoops with SNPINV active, the memory bus 
MTHIT# line is driven low, regardless of the value at 
the 82495 MTHIT# pin. 

The MTHIT# signals from the memory bus control- 
lers on the bus are wire-anded together. Because the 
82495 MTHIT# output only changes state with each 
new snoop, the master memory bus controller must 
float its MTHIT#. 
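
The MTHIT# drive rule and the wire-AND of the bus line can be expressed as two small functions. This is an illustrative model: signal levels are 0 (low) and 1 (high), the function names are hypothetical, and a floating driver is assumed to contribute a high level through a pull-up.

```python
def mbus_mthit_level(mthit_82495, snpinv):
    """Level a snooping MBC drives onto the memory bus MTHIT# line.

    mthit_82495 -- level at the 82495 MTHIT# pin (0 = low, 1 = high)
    snpinv      -- True if SNPINV is active for this snoop
    """
    return 0 if snpinv else mthit_82495


def wire_and(levels):
    """Resolved line level: low if any driver pulls the line low."""
    return int(all(levels))


FLOATING = 1  # assumed pull-up level contributed by the floated master
print(wire_and([FLOATING, mbus_mthit_level(1, snpinv=False)]))
print(wire_and([FLOATING, mbus_mthit_level(1, snpinv=True)]))
```

Because the line resolves low whenever any driver is low, the master has to float its own output so that a stale level from its 82495 does not mask the snooper's result.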



MBOFF# (O) - Memory bus BOFF# 

This is the memory bus back-off signal which is driven 
by the master MBC. The master MBC floats its bus 
concurrent with MBOFF# activation. When the 
snooper MBC samples an active MBOFF# and it has a 
pending snoop write-back cycle, it issues the cycle to 
the memory bus. Note that the snooper issues the cycle 
even though it is still in a bus hold state (MHLDA 
asserted). If MHITMI# is sampled active during 
MSWENDI# and the previous cycle has completed, 
then MBOFF# will be asserted immediately after
MSWENDI#. If the previous cycle has not completed
and the pipelined cycle hits a modified line, then 
MBOFF# will be asserted only after the previous cycle 
completes. The snooping MBC floats its bus only after 
the snoop write-back cycle has completed. Note that 
from the arbiter's viewpoint the bus is still granted to 
the master MBC. 

MABORT# (O) - Memory bus Abort

This is the memory bus abort signal which is driven
by the master MBC. When the main memory samples
an active MABORT# it aborts any cycle that is
currently being serviced. The memory aborts the cycle
regardless of the number of MBRDY#s that have been
issued. Thus MBRDY# of the aborted cycle will not
be issued after MABORT#. A new cycle could be
serviced immediately after MABORT#.

If MHITMI# is sampled active during MSWENDI#
and the previous cycle has been completed, then
MABORT# is asserted immediately after MSWENDI#.
If the previous cycle has not been completed and the
pipelined cycle hits a modified line, then MABORT#
is asserted only after the current cycle has completed.

MABORT# can also be asserted during read for
ownership with a hidden write (allocation after a
non-completed write in the main memory). In this case if the
master MBC samples an active MKEN# (1 MCLK
after MADS#) during a potentially allocatable write
cycle, it asserts MABORT# immediately, i.e. 2
MCLKs after MADS#.

Note that MABORT# is always guaranteed to be a 1
MCLK width pulse.
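
As a rough illustration of the snoop-hit timing rules above, the following sketch picks the MCLK in which MABORT# would be asserted. It is a simplified model, not the MBC state machine: the function name, its arguments, and the clock-index convention are assumptions made for the example, and the "one MCLK after the previous cycle ends" rule is one reading of "only after the cycle has completed".

```python
from typing import Optional


def mabort_clock(mswendi_clk: int, mhitmi_active: bool,
                 prev_cycle_end_clk: Optional[int] = None) -> Optional[int]:
    """Return the MCLK index in which MABORT# is asserted, or None if no abort.

    mswendi_clk        -- MCLK in which MSWENDI# is sampled active
    mhitmi_active      -- True if MHITMI# is sampled active in that MCLK
    prev_cycle_end_clk -- MCLK in which a still-running previous cycle
                          completes; None if it has already completed
    """
    if not mhitmi_active:
        return None  # the snoop did not hit a modified line
    if prev_cycle_end_clk is None or prev_cycle_end_clk <= mswendi_clk:
        return mswendi_clk + 1  # immediately after MSWENDI#
    # Assumption: the abort follows in the MCLK after the previous cycle ends.
    return prev_cycle_end_clk + 1
```

With MSWENDI# sampled in clock 4 and no outstanding cycle, the sketch returns clock 5, which agrees with the non-pipelined abort described for Figure C-3.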

MLOCK# (I/O) - Memory bus LOCK 

This signal does not exist in the current implementa- 
tion. Instead, the MBC simply refuses to give up the 
M-bus to the arbiter when it is running locked accesses. 

MHOLD (I) - Memory bus Hold Request 

When this input to the MBC is asserted, the MBC as- 
serts MHLDA and floats all inputs and outputs except 
MBREQ, MHLDA, MSWENDO#, and MBOFF#. If 
the MBC has outstanding bus cycles in progress 
(MADS# has been asserted), they are completed be-
fore the MBC relinquishes the bus. MHOLD is recog-
nized during MRESET assertion.



MHLDA (O) - Memory bus Hold Acknowledge 

The memory bus hold acknowledge signal goes active 
when an MBC relinquishes the bus in response to an 
MHOLD request. The memory bus controller floats its 
bus in the same MCLK that it issues the MHLDA. 
When the MBC leaves bus hold, MHLDA is negated 
and the core resumes driving the bus. If a cycle is pend- 
ing when leaving bus hold, the MADS# will be issued 
in the same MCLK that MHLDA is negated. 



MINT (I) - Memory bus Interrupt 

This interrupt signal is connected directly to the i860 
XP CPU in the core. 



MKEN# (I) - Memory bus KEN# 

This is the memory bus cache enable signal. It is used 
by the MBC to determine the length of the current bus 
cycle, and is also connected directly to the 82495 
MKEN# input. 

In potentially cacheable read cycles, it determines cycle 
length. In potentially allocatable write cycles, it deter- 
mines whether read for ownership with hidden write 
will be performed. 

In the current implementation, MKEN# must be driv- 
en by the memory controller in the MCLK after 
MADS# was issued.



MRO# (I) - Memory bus Read Only 

Assertion of this signal causes an access to be treated as 
read only by the core. This signal is connected directly 
to the 82495 MRO# input, as well as to the MBC. 



MWB/WT# (I) - Memory bus WB/WT# 

This is the write-back/write-through input connected 
to the memory bus. It is connected through MBC logic 
to the 82495 MWB/WT# input. 



MDRCTM (I) - Memory bus Direct-to-M 

This is the memory bus DRCTM# signal which forces 
a line entering the cache to be placed directly in the 
[M] (modified) state. In addition to this signal which is 
connected from the memory bus to the 82495, the MBC 
can internally drive the 82495's DRCTM# pin during 
read-for-ownership cycles. 

MFLUSH#, MSYNC# (I) - Memory bus FLUSH#, SYNC#

These signals cause the core to flush or sync its cache, 
by asserting FLUSH# or SYNC# to the 82495, re-
spectively. The signals are driven by the main memory
controller upon detecting a Core flush or sync com- 
mand, which consists of a special cycle with either 
MBE1# or MBE3# active, respectively. 

MBREQ (O) - Memory bus Request 

The MBREQ# signal is asserted by an MBC to indi-
cate to the memory bus arbiter that the MBC needs the
memory bus. An MBC will generate this signal regard- 
less of whether or not the MBC is currently driving the 
bus. 



MBREQ# is not issued for snoop write-back cycles. If
the snooping core already had its MBREQ# pin assert-
ed, the pending cycle which caused the MBREQ# is
aborted by the snoop write-back, according to 82495
protocol. The MBC state machines of the snooper,
however, continue to assert MBREQ# until an internal
time-out period has elapsed, allowing the snooping
82495 to reissue the aborted cycle after the snoop write-
back has completed. Therefore a core which is waiting
for the bus can service a snoop write-back without los-
ing its request for the bus.

MLEN (O) - Memory bus LEN 

This signal together with MCACHE#, MW/R# and 
MKEN# determine the memory bus cycle length ac- 
cording to the following table: 



MW/R#   MLEN   MCACHE#   MKEN#   Length   Notes
  X      0        1        X       1        1
  X      1        1        X       2        1
  0      0        0        1       1        2
  0      1        0        1       2        2
  0      X        0        0       4
  1      X        0        X       4

NOTES: 

1. Locked i860 XP CPU write-back cycles (length = 4), caused by the i860 XP CPU executing a FLUSH instruction during a 
LOCKed sequence, are treated as normal write cycles (length = 1 or 2 according to LEN). This is allowed since i860 XP CPU 
write-back cycles always access a 82495 modified line (in [M] state) and are only written into the 82490, without updating 
memory. 

2. MKEN# must be driven valid the clock following MADS# by the memory controller. 
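
The cycle length table can also be expressed as a small lookup function. This is a sketch under stated assumptions: blank cells in the printed table are read as 0, arguments are raw pin levels (so 0 means asserted for the active-low MCACHE# and MKEN#), and the function name is hypothetical.

```python
def mbus_cycle_length(mwr: int, mlen: int, mcache_n: int, mken_n: int) -> int:
    """Number of transfers in a memory bus cycle.

    mwr      -- MW/R# level (1 = write, 0 = read)
    mlen     -- MLEN level
    mcache_n -- MCACHE# level (0 = asserted)
    mken_n   -- MKEN# level (0 = asserted)
    """
    if mcache_n == 1:
        # Non-cacheable access: the length follows MLEN (1 or 2 transfers).
        return 2 if mlen else 1
    if mwr == 1:
        # MCACHE# asserted on a write marks an 82495 write-back: full line.
        return 4
    if mken_n == 0:
        # Potentially cacheable read accepted by memory: full line fill.
        return 4
    # Potentially cacheable read with MKEN# negated: the length follows MLEN.
    return 2 if mlen else 1
```

The order of the checks mirrors the don't-care entries in the table: MKEN# only matters for reads with MCACHE# asserted, and MLEN only matters when the cycle is not a full line transfer.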



MM/IO#, MD/C# (O) - Memory bus M/IO# and D/C#

These signals, together with MW/R#, define the mem-
ory bus cycle, according to the i860 XP CPU Data
Sheet. They are driven in the same MCLK as MADS#.



MBE[7:0]# (O) - Memory bus BE[7:0]#

The byte enable signals to the memory bus identify 
which bytes are being accessed. They are identical to 
the CPU byte enables on CPU generated cycles. For 
82495 generated cycles (write-backs and allocations) all 
MBE#s are asserted. 



MCACHE# (I/O) - Memory bus CACHE#

In a master core MCACHE# is an output; in a snoop- 
ing core it is an input. As an output, it indicates poten- 
tially cacheable reads or a 82495 write-back. 
MCACHE# is used by the system memory together 
with MLEN, MW/R# and MKEN# to determine cy- 
cle length. As an input, MCACHE# is connected to 
the 82495 SNPNCA pin. 



MW/R# (I/O) - Memory bus W/R# 

This signal is an output for a master core, an input for a 
snooping core. As an output, it indicates whether the 
memory access is a read or a write, and is used by the
system memory along with MM/IO# and MD/C# to
determine the cycle type, according to the i860 XP 
CPU Data Sheet. As an input, the signal is connected 
directly to the 82495 SNPINV pin. 



MA[31:3] (I/O) - Memory bus Address 

These are the memory bus address lines of the MBC. 
Along with the byte enable signals, they define the 
physical area of memory or I/O accesses. In a master 
MBC they are driven by the 82495 onto the memory 
bus together with MADS# (same MCLK). In a snoop- 
ing MBC, these lines are inputs to the 82495 which are 
latched by the MSNPSTB# signal. 

MD[63:0], MDP[7:0] (I/O) - Memory bus Data and Data Parity

64 bits of data, 8 bits of parity, connected through 
transceivers to the i860 XP CPU and 82490s. When an 
MBC does not own the bus, these pins are tristated. 

XAS#/XSAS# - X Unit Address Strobe 

XAS# is generated in the X-unit (sync to CLK), and is
synchronized and sent to the Y-unit as XSAS#.

XAS# indicates the start of a memory bus cycle from
the X-unit (CLK side). XAS# is generated as a result
of a CADS# from the 82495 on a read cycle or
CDTS# from the 82495 on a write cycle. XAS# is
held active until the X-unit receives YSBGT#.



YMEOC#/YSMEOC# (O) - MBC Memory End Of Cycle

YMEOC# is generated in the Y-unit, is synchronous
to MCLK, and is sent to the X-unit as YSMEOC#.
It indicates the M-bus transfer has finished, based on
the MBC's transfer length count. YMEOC# directly
drives the 82490s' MEOC# inputs. YSMEOC# causes
generation of the CRDY# signal to the 82495 and
82490s. For non-pipelined cycles CRDY# is issued
immediately after an active YSMEOC# (if CDTS# was
issued). For pipelined cycles CRDY# is issued after
the CRDY# of the previous cycle (if YMEOC# and
CDTS# of the pipelined cycle were issued).

YMEOC# is issued at least 2 MCLKs after YBGT# 
(for every cycle). 



YBGT#/YSBGT# - Memory bus Guaranteed Transfer

YBGT# is generated in the Y-unit, and is synchroniz- 
ed and sent as YSBGT# to the X-unit. 

This signal is generated in the Y-unit after MADS# 
(the cycle has been issued on the memory bus). When 
YSBGT# arrives at the X-unit, the signal causes asser- 
tion of the 82495's BGT# input, and one clock later 
(non-pipelined cycle) the assertion of KWEND#. 
YSBGT# of a pipelined cycle (which is sampled during
the initial cycle, i.e. before its CRDY#) causes the
BGT# and KWEND# of the pipelined cycle to be
issued immediately after CRDY# of the initial cycle.

YBGT# of a pipelined cycle cannot be issued before
the MSWEND# of the previous cycle. This is guaran-
teed by the M-bus protocol, which ensures that a pipe-
lined MADS# is not issued until after the
MSWEND# of the previous cycle.



YCEOC#/YSCEOC# - MBC CPU End Of Cycle

This signal is internal to the MBC: YCEOC# is generated
synchronous to MCLK, and is synchronized to
CLK to produce YSCEOC#. It indicates that the
CPU bus transfer has finished, based on the MBC's
transfer length count. It generates the BRDY#s to the
82495, 82490, CPU, and to other MBC machines. For
non-pipelined cycles all BRDY#s except the first are
issued immediately after an active YCEOC# (if
CDTS# was issued). For pipelined cycles all BRDY#s
except the first are issued after the CRDY# or the last
BRDY# (BRDY# * CLEN1) of the previous cycle.

YCEOC# can be issued before, with, or 1 clock after
YMEOC#. When the line ratio is 2 or 4, YCEOC#
precedes YMEOC# by a significant time, allowing
CPU linefills to complete long before the M-bus transfer
completes.

YCEOC# is asserted only if RDYSRC is active 
(High). 



BGT#, KWEND# (O) - Bus Guaranteed 
Transfer, Cache Window End to 82495 

BGT# and KWEND# are generated for every cycle 
(including snoop write-backs). In a non-pipelined cycle 
BGT# is issued immediately after sampling YSBGT# 
active, and KWEND# is issued 1 clock later. In pipe- 
lined cycles, these signals are asserted after the 
CRDY# of the initial cycle. 

BUS CYCLES



Non-aborted Read Cycles 

Figure C-2 is a timing diagram for the memory bus 
controller executing a line fill after the i860 XP CPU 
issues a read which misses the 82495/82490. The dia- 
gram reveals a number of the signals which are internal 
to the MBC, to provide a better perspective on the tim- 
ing of events. Note that signals which begin with an M 
are MBC signals to the memory bus. Signals that begin 
with Y originate in the Y side of the MBC which is 
synchronous to MCLK, and an X denotes origin in the 
X state machines, which are synchronous to CLK. 

The i860 XP CPU microprocessor issues a read cycle in 
CLK 0, as indicated by the assertion of ADS#. The 
82495 performs the tag lookup, and finds the request a 
cache miss. In CLK 2, the 82495 issues CADS# and 
the cycle control signals, alerting the memory bus con- 
troller that a 4 transfer 82495 read is requested. 

The X side state machines, which run on the processor 
CLK, issue an XAS# on the CLK after CADS# for a 
82495 read cycle (CW/R# = 0). The XAS# signal 
passes through the synchronizer running on MCLK to 
become synchronized in two MCLKs. The synchroniz- 
ed XAS# signal, called XSAS#, is sent to the Y side of 
the MBC in MCLK 4. 
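
The two-MCLK synchronization delay can be pictured with a behavioral model. The sketch below assumes the synchronizer is a chain of two flip-flops clocked by MCLK, which the text does not state explicitly; the class name and the idle level of 1 (XAS# negated) are likewise assumptions for illustration.

```python
class TwoStageSynchronizer:
    """Behavioral model of a two flip-flop synchronizer (assumed structure)."""

    def __init__(self, idle_level: int = 1) -> None:
        # Both stages start at the negated level of an active-low signal.
        self.stage1 = idle_level
        self.stage2 = idle_level

    def clock(self, async_in: int) -> int:
        """Advance one destination-clock edge and return the synchronized output."""
        self.stage2 = self.stage1   # second flop takes the first flop's old value
        self.stage1 = async_in      # first flop samples the asynchronous input
        return self.stage2
```

If XAS# is already low when the first MCLK edge arrives, the output (XSAS# in this model) is still high after that edge and goes low after the second edge, giving the two-MCLK delay described above.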

In MCLK 5, XSAS# has initiated the assertion of 
MBREQ to request the memory bus from the memory 
bus arbiter. If the bus is already owned (or once it is 
owned) by this MBC, XSAS# causes the assertion of 
MADS# to the memory bus, MAOE# to the 82495, 
and the internal YBGT# signal. The assertion of the 
82495's MAOE# signal allows the 82495 to drive its ad-
dress lines to the memory bus. YBGT# indicates that
the memory bus is owned by this MBC, and is sent to 
the synchronizer for the X side of the MBC as well as 
many Y side state machines. 

On the Y side, YBGT# is used to deassert MBREQ#,
to sample YALLOC# on writes, and to initiate
MSNPSTB#. MSNPSTB# is asserted in MCLK 6 to
request a snoop in the other MBC. YBGT# is also
synchronized to CLK, appearing as YSBGT#, by
CLK 9. YSBGT# causes the assertion of BGT# to the
82495 in CLK 10, and, 1 CLK later, KWEND#. The
MKEN# input, which must be valid to the 82495
when KWEND# is asserted, must be driven by the
main memory on the MCLK after MADS# for this
implementation. These signal activities define the
initiation of normal bus cycles (as opposed to snoop
write-backs).

In this particular example, the memory bus responds 
quickly to the read request. Here, the memory subsys- 
tem drives MNA# to the MBC in MCLK 6, and pres- 
ents data on the memory bus in MCLK 7. Since 
MBRDY# must be driven by the memory bus 1 
MCLK before data is available, MBRDY# is asserted
in MCLK 6, with successive MBRDY#s on the follow-
ing MCLKs. The YMBRDY# output of the MBC is
the MBRDY# signal delayed one clock, and drives the
MBRDY# input on the 82490s to read in the incoming 
data. 

While the data transfer is occurring, the second memo- 
ry bus controller responds to the snoop request for this 
memory access in MCLK 8. Because the data is not 
present in the cache of the other core, that MBC will 
assert its MSWENDO# output with MHITMO# 
driven high. These outputs of the snooping core are tied 
directly to the MSWENDI# and MHITMI# inputs, 
respectively, of the master core in this two core imple- 
mentation. Both of these signals are passed to the 82495 
(MSWENDI# is synchronized first) as well as to the 
state machines of both sides of the MBC. The arrival of
these signals allows the core to accept the data as valid,
and to conclude the read operation when all of the
data has been transferred.

The arrival of the fourth MBRDY# generates the 
YMEOC# and YCEOC# signals in MCLK 10. 
YMEOC# drives the MEOC# input on the 82490s. In 
addition, both signals are synchronized and sent to the 
X side of the MBC. Upon the arrival of YSCEOC#, 
the X state machines begin generating BRDY#s to the 
i860 XP CPU. Upon arrival of YSMEOC#, CRDY# 
is driven to the 82495, indicating the end of the cycle. 
YMEOC# and YCEOC# are used to reset many of 
the Y side state machines, including cycle type and 
length indicators, and the drivers of 82490 signals such 
as YMALE# and YMSEL#. On the X side, the reset 
functions are triggered by CRDY# and the last 
BRDY#. 

Figure C-2. Non-Aborted Read Cycles

Aborted Non-Pipelined Cycles

Figure C-3 illustrates an aborted non-pipelined cycle. 
MHITMI# is sampled active during MSWENDI# 
(clock 4) indicating a snoop hit to a modified line. Since 
the cycle is non-pipelined, MABORT# is issued imme- 
diately and the core floats its bus (clock 5). Although 
the bus is floated by the master core, the master still 
owns the bus (MHLDA remains inactive). 

MABORT# in clock 5 causes the main memory to
abort its cycle regardless of the number of MBRDY#s that
have been issued. MBOFF# is also asserted in clock 5 
to indicate to the snooping core that the master is float- 
ing its signals and the write-back may begin. The main 
memory floats its data bus in clock 6 in response to 
MABORT#. In the following clocks a snoop write- 
back cycle is performed by the snooper. The snooper 
will release the bus at the end of the write-back. 

Note that MSNPSTB# is not asserted during the
write-back cycle since it obviously will not hit any 
cache. 



Aborted Pipelined Cycles 

Figure C-4 illustrates an aborted pipelined cycle.
Although MHITMI# is sampled active during
MSWENDI# (clock 7), MABORT# will not be issued
immediately since the previous cycle has not been
completed yet. MABORT# is issued in clock 9 after
the last data slice was read into the core. The core floats
its bus and asserts MBOFF# concurrently with
MABORT#. Upon sampling MBOFF#, the snooping
MBC begins the snoop write-back in clock 10.



Write Allocate 

Figure C-5 illustrates a write cycle which is potentially 
allocatable. This write is performed on the bus only in 
order to sample the MKEN#, since the allocation cy- 
cle will only be guaranteed if MKEN# is active. 

MKEN# is sampled active in clock 2, causing
MABORT# to be issued immediately. The write cycle is
aborted even before MSWEND# because a read for
ownership cycle is guaranteed to be performed after
the aborted write.

In clock 4 the MADS# of the allocation cycle, which
becomes the MADS# of the read for ownership cycle,
is issued. This MADS# is issued only if MSWEND#
has not been issued yet, or if MSWEND# was issued
and MHITMI# was negated. If MHITMI# is asserted
during the MSWEND# that was issued, MADS# will
not be issued (since the snooper issues its MADS#).

A second MABORT# is issued in clock 8, signaling
the memory to abort the allocation and the snooper to
start flushing the modified line. Note that a second
MABORT# is issued regardless of whether the MADS# of
the allocation was issued. The first MABORT#
(clock 3) aborts the write cycle in the memory module
and does not affect the snooper. The second
MABORT# (clock 8) tells the snooper to start
its write-back cycle (and, if the MADS# of an allocation
was issued, also aborts it in the memory module).

Figure C-3. Aborted Non-Pipelined Cycle

Figure C-4. Aborted Pipelined Cycles

MSNPSTB# is not issued for the allocation cycle since 
write and allocation cycles access the same line. 

If MKEN# had been negated in clock 2 then an alloca- 
tion would not have been performed and the write cycle 
would have continued as a non-allocatable write cycle 
(see figure C-6). 



Non-Allocatable Write 

Figure C-6 illustrates a write cycle without an alloca- 
tion. It can be either a non-potentially allocatable write 
cycle or a potentially allocatable write with inactive 
MKEN# (clock 1). 

The write cycle is aborted (MABORT# in clock 3)
after sampling active MHITM# during MSWEND# 
(clock 2). In clock 11 the master core re-issues the 
MADS# of the aborted write cycle (after the snoop 
write-back has been completed). MSNPSTB# will not 
be issued again since the updated data had been written 
into the main memory and the snooper has gone to the 
invalid state. 



LIMITATIONS OF DESIGN 

The primary limitation of the implementation as it has 
been presented so far is that it includes only two proces- 
sors. The protocol set up in the design is not limited to 
two processors. The next section outlines the imple- 
mentation details which must be modified to extend the 
design to more than two processors. 

The design has no support for CS8 mode, so the proces-
sors cannot be booted from 8-bit EPROMs. Instead,
both processors boot in 64-bit mode, which may com-
plicate the use of the design in stand-alone systems.

The i860 XP CPU's BERR, or Bus ERRor, input is not 
utilized in this design. The pin could be used simply as 
a non-maskable interrupt pin, but the memory bus con- 
troller as designed makes no provision to use BERR to 
correct a faulty bus access. Likewise, the parity check 
results from the i860 XP CPU's PCHK# pin are of 
little value in this design outside of testing the i860 XP 
CPU's parity functions. The MBC itself does not check 
the PCHK# output, and has no means of reissuing an 
access in case of parity error. 

The memory bus controller design here does not decode 
and utilize the i860 XP CPU INTA cycles. The INT 
pin itself is connected directly to the i860 XP CPU, 
without affecting MBC operation. 

Figure C-5. Potentially Allocatable Write

Figure C-6. Non-Allocatable Write




AP-452 




The Multiprocessor Interrupt Controller (MPIC) cur- 
rently being designed by Intel is not utilized in or sup- 
ported by this memory bus controller. 

The memory bus controller's treatment of LOCKed cy-
cles is simple and straightforward: when the 82495 issues
a memory access which is LOCKed (KLOCK# ac- 
tive), the MBC will not relinquish the bus until a cycle 
which is not LOCKed is issued. While this is adequate 
for simple systems, it will not suffice for dual ported 
memories, where a given block of memory can be ac- 
cessed through more than one bus. In such systems, a 
LOCK signal must be introduced to alert all possible 
simultaneous users of memory that a LOCKed access is 
in progress. 

EXTENSION OF DESIGN TO THREE 
OR MORE CPUs 

Two Processor Implementation 
Overview 

Figure C-7 presents a simplified view of the multipro- 
cessing signals for the two processor implementation. 
The basic address, data, and memory cycle control lines 
are attached to a common bus. Only the core which 
controls the bus will drive these signals, with all other 
cores floating these lines and asserting MHLDA.

When the bus master MBC issues a cycle, the 
MCACHE# and MW/R# cycle attributes also serve 
to drive the 82495s' SNPINV and SNPNCA inputs of 
both cores. SNPSTB# is issued by the master in the 
clock following MADS#. In reality, both cores have a
SNPSTB# output at their Y-side state machines driv- 
ing a common line which connects to the SNPSTB# 
input of both 82495s. The core which does not own the 
bus floats its state machine driver on MHLDA, so the 
signal acts only as an input in that core. The master 
drives the SNPSTB# line, but the action of SNPSTB# 
is blocked in its own 82495 because its MAOE# signal 
is asserted. 

The results of the snoop are driven out on the snooping
core's MTHIT# and MHITMO# outputs, and
MSWENDO# is asserted. These signals are connected
directly to the MWB/WT#, MHITMI#, and
MSWENDI# inputs in the master core, respectively.

The MBOFF# signals of the two MBCs are also con- 
nected together. During MHLDA (in a snooping 
MBC) MBOFF# is an input, and in the master it is an 
output. If the master asserts MBOFF#, control of the
data and control busses is given to the snooping MBC 
so that a snoop write-back can be performed. 

Three or More Processors 



This section gives one method of extending the design 
given here to three or more processors. The solution
presented here assumes that no changes are made to the
state machines as they are written for the two processor 
system. Instead, some minor glue logic is added to three 
of the signals to make the core an element in a scalable 
multiprocessing system. However, modifying the state 
machines is also a plausible solution. 

In an implementation with three or more processors, 
the primary address, data, and cycle control lines are 
still connected to a common bus, as in the two proces- 
sor version. MCACHE# and MW/R# are also uti-
lized in the same way as the two processor version: the
outputs of the cores drive a common line which in turn 
also drives the 82495 SNPNCA and SNPINV inputs of 
all cores. 

The SNPSTB# signal connects directly from core to 
core in a two processor version. In an implementation 
with three or more processors, the SNPSTB# line is 
simply extended to all the processors in the system. 
Only the bus master will actually drive the line, and 
snoopers will be floating the SNPSTB# output from 
their state machines. Again, the snoop request is ig- 
nored in the master because its MAOE# is asserted. 
Similarly, the MBOFF# signal becomes a common line
which only the master will drive and which all other 
cores will sample. 

The six signals in the upper portion of Figure C-7,
which communicate MSWEND and the snoop results
MHITMO# and MTHIT#, will require more glue
logic to extend the design to three or more processors.
The snoop results MHITMO# and MTHIT# must
now be considered for multiple cores when a snoop has
been issued, and the master MBC must not sample
these results until all snooping cores have issued their
MSWENDO#.

To resolve these issues, common bus lines carrying 
these signals are introduced, where all cores have out- 
puts driving these lines, and inputs to sample them. The 
characteristics of such MTHIT# and MHITM# lines 
are straightforward: the line should default to 1, and if 
any core drives one of these outputs low, the line 
should be pulled low. The MTHIT# line has the sim- 
plest solution. As shown in figure C-8, by passing the 
signal which is produced by the core through an open 
collector buffer, the buffered MTHIT#s can be tied to 
a single line which is sampled directly by all cores' 
MWB/WT# pins. The open collector buffer sinks cur- 
rent like a normal gate output to drive a logic 0, but 
instead of driving current for a logic 1, the open collec- 
tor device assumes a high impedance state for logic 1. 
Thus, if all of the cores output MTHIT# as 1, the
MTHIT# line remains at a logic 1 level because of the
pull-up resistor. If one or more cores output a logic 0,
the MTHIT# line will be pulled to the logic 0 level.
This precisely matches the desired behavior of
MTHIT# for the system: if one or more cores have
the snooped data cached, the master MWB/WT# input
must be asserted low. It is important to note that
the MTHIT# output of the master is floated: because
the 82495 MTHIT# output only changes on each new
snoop, the value of the master MTHIT# output for the
previous snoop would otherwise be erroneously included
in deciding the level of the MTHIT# line.

Figure C-7. Interprocessor Communications in Two Processor System

The MHITM# line follows the same principle as the 
MTHIT# line. The MHITM# signal is not floated in 
the master core, and poses the problem which floating 
MTHIT# avoids: the value of the master's last MHIT- 
MO# output is still present when the new access is 
being made. To resolve this, the inverted value of 
MHLDA is ORed with MHITMO# before going to 
the open collector buffer. The master's MHLDA is al- 
ways a 0, so the OR gate will always guarantee a 1 
being passed from the master to the MHITM# line. 
Again, if one or more of the snooping MBCs outputs a
logic 0, the MHITM# line will properly assume a 0
level.
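
The shared attribute lines can be modeled in a few lines. This is an illustrative sketch only: signal levels are plain integers (0 = low, 1 = high), `None` represents a floated output, and both function names are invented for the example.

```python
from typing import Iterable, Optional


def open_collector_line(outputs: Iterable[Optional[int]]) -> int:
    """Level of a pulled-up line shared by open collector buffers.

    Each entry is 0 (buffer sinks current), 1 (buffer is high impedance),
    or None (core output floated). The line is 0 if any buffer outputs 0;
    otherwise the pull-up resistor holds it at 1.
    """
    return 0 if any(level == 0 for level in outputs) else 1


def mhitm_to_buffer(mhitmo_n: int, mhlda: int) -> int:
    """MHITMO# ORed with the inverted MHLDA, as fed to the buffer.

    In the master (MHLDA = 0) the result is always 1, which hides the stale
    MHITMO# value left over from the master's previous snoop.
    """
    return mhitmo_n | (1 - mhlda)
```

For a three-core system, `open_collector_line(mhitm_to_buffer(m, h) for m, h in [(0, 0), (1, 1), (0, 1)])` evaluates to 0: the master's stale 0 is masked, and the second snooper's hit to a modified line pulls the MHITM# line low.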

The open collector buffer presents an easy way to add 
new MBCs to the shared lines. The desired behavior of 
a shared MSWENDA (MSWEND All) line is different 
from the attribute lines, MTHIT# and MHITM#. 
Where the master core should sample a 0 if any one or
more snooping cores drive a 0 on these attribute
lines, the master core must not receive its MSWENDA
indication until all cores in the system have
asserted their MSWENDO# output. The answer is to
invert the MSWENDO# output of each snooper, so that
a zero is driven onto the MSWENDA line while the
snoop is being performed, and a one is output once the
snoop has completed. From the MSWENDI# perspective,
MSWENDI# should not be asserted at the master
core if any snooping core is still driving a zero on the
MSWENDA line (is not done snooping). Therefore, the
MSWENDA line is the opposite logic polarity of the
actual MSWENDO# signal. The master samples
MSWENDA after the signal passes through an inverter,
to restore the logic level. The output of each core
is passed through an inverter before going to the open
collector buffer. The inverting device is a NAND gate
because the MSWENDO# signal shares the problem of
MHITM#, and must be "faked" by the master. In this
case, instead of the last snoop's results causing the
problem, the master's MSWENDO# signal is reset to 1
(still snooping) when the SNPSTB# line is asserted.
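
The MSWENDA scheme can be sketched the same way. The text does not name the second input of the NAND gate; the model below assumes it is MHLDA, by analogy with the MHITM# fix, so that the master (MHLDA = 0) always reports "done". Function names and the tuple layout are illustrative.

```python
from typing import Iterable, Tuple


def mswenda_driver(mswendo_n: int, mhlda: int) -> int:
    """NAND of MSWENDO# and MHLDA (second input is an assumption).

    Snooper (MHLDA = 1): outputs the inverse of MSWENDO#, so 1 means done.
    Master  (MHLDA = 0): always outputs 1, faking a completed snoop.
    """
    return 1 - (mswendo_n & mhlda)


def mswendi_n_at_master(cores: Iterable[Tuple[int, int]]) -> int:
    """Active-low MSWENDI# seen by the master.

    cores -- (MSWENDO#, MHLDA) pairs for every core on the bus.
    """
    line = 1
    for mswendo_n, mhlda in cores:
        line &= mswenda_driver(mswendo_n, mhlda)  # wired-AND on MSWENDA
    return 1 - line  # inverter in front of the master's MSWENDI# input
```

With one master and two snoopers, `mswendi_n_at_master([(1, 0), (0, 1), (1, 1)])` stays at 1 because the second snooper has not finished; it drops to 0 only once both snoopers have asserted MSWENDO#.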

Again, these simple adaptations can be implemented in 
a similar manner in the logic of the state machines. The 
MHITMO# line can be forced to a logic one or floated 
when the core is a master (after YBGT, for example). 
The MSWEND signal might be implemented as an as- 
serted-high system signal, if open collector buffers are 
used to attach new cores to the shared system bus. 

STATE MACHINES AND SCHEMATICS

STATE DIAGRAMS

[State diagram: YCPULEN]

[State diagram: YENMSWND (240957-49). States: IDLE, ENMSWND, RCVR. Transition labels: MRESET, YMSWEND#, YMSWEND.]







[State diagram: YENXSAS (240957-50). States: ENXSAS, DISXSAS. Transition labels: MRESET, YBGT#, YBGT, XSAS#, XSAS.]

PXSAS = XSAS.ENXSAS.XSNPWB#
PSWBAS = XSAS.ENXSAS.XSNPWB










[State diagram: YIMSWND (240957-51). States: IDLE, ENBL, DIS1, DIS2; outputs YDRCTM, DISWND. Transition labels: MRESET; PXSAS# + YALLOC#; PXSAS . YALLOC; YMEOC# . YPIPE#; YMEOC + YPIPE.]

YIMSWEND = MSWENDI.YALLOC#.DISWND#





[State diagram (240957-52). Transition labels: MRESET, MBRDY#, MBRDY.]



[State diagram: YMBREQ (240957-53). States include HB1, HB2, BUSREQ, HBASWB; output MBREQ. Transition labels combine MRESET, PXSAS, PXSAS#, YBGT, YBGT#, MHLDA, and MHLDA#.]



[State diagram (240957-54). Transition labels: PCTCXFR; YMEOC#. Output action: {/MSEL IF YMEOC# . WMSWND . RSTRT#}. Legend: {/SIGNAL} = SIGNAL ASSERTED.]






[State diagram: YMDOE (240957-55). Transition conditions:]

(YNOPIPE+MHLDA).YMEOC
+YPIPE.YMEOC.MWR#
+YNOPIPE#.YPIPE#.MHLDA#

YMADS.MWR.(YNOPIPE+MHLDA)
+YPIPE.MWR.YMEOC

YMEOC#.(YNOPIPE+YPIPE+MHLDA)
+YMEOC.YPIPE.MWR





[State diagram: YMEMALE (240957-56). States: IDLE, CLOSE; output YMALE. Transition labels: MRESET, YBGT#, YBGT, and the conditions below.]

YNOPIPE.(MNA.YMADS#+WMNA).WMSWND+YPIPE#.YMEOC

YPIPE+YMEOC#.YNOPIPE#
+YMEOC#.YNOPIPE.[(MNA.YMADS#+WMNA).WMSWND]#








[State diagram: YMEMEOC (240957-57). Reset: MRESET. Transition condition:]

MBR0.MLEN1.MBRDY.WMSWND.MABORT#
+MBR1.MLEN1.WMSWND.MABORT#
+MBR1.MLEN2.MBRDY.WMSWND.MABORT#
+MBR2.MLEN2.WMSWND.MABORT#
+MBR3.MLEN4.MBRDY.WMSWND.MABORT#.TR4
+MBR4.MLEN4.WMSWND.MABORT#.TR4
+MBR7.MLEN4.MBRDY.WMSWND.MABORT#
+MBR8.MLEN4.WMSWND.MABORT#
+YALLOC.MABORT

{/YMFRZ IF YALLOC.MABORT}



[State diagram: YMEMLEN (240957-58). Reset: MRESET. Transition labels: YMEOC# + YPIPE.L1; YMEOC.]

L1 = LEN#.(XLKCACHE#+MKEN#.XLRDYSRC)
L2 = LEN.(XLKCACHE#+MKEN#.XLRDYSRC)
L4 = XLKCACHE.(MKEN+XLRDYSRC#)






[State diagram: YMEMLOCK (240957-59). State: MEMLOCK; output YMLOCK. Reset: MRESET. Transition labels as printed: YBGT.KLOCK#.YALLOC# + HBASWB#; YBGT.KLOCK + YALLOC + HBASWB; ELSE.]





[State diagram: YMSNPSTB (240957-60). States: IDLE, SNPREQ; output MSNPSTB. Reset: MRESET. Transition labels: ELSE and the condition below.]

YBGT.(SNPDIS+MMIO+MWR.MCACHE+MWR#.XLRDYSRC#.RFO)#












[State diagram: YRDYSTR (240957-61). States BR0 through BR8. Transition labels combine YMEOC, YPIPE, MBRDY, MBOFFI, and TR4; /CTCEND is asserted on some transitions; MRESET + MABORT returns the machine to its initial state.]



MRESET 


( \ ELSE 






\ YSWEHITM.MWR.YALLOC* 




PCTCXFR* j 


J +CTCDIS* 

1 +CTCDIS.(YSWEHITM+PCTCXFR) 

/restartx 

I RSTRT } 






PCTCXFR 


240957-62 




YRSTRT 









YNOPIPE.YMADS#.MNA 

+ PIPE.YMEOC.MNA.YMADS# 



YMADS#.YMEOC#.YNOPIPE 

YWMNA 



240957-63 



MRESET 




^vtdleTv 




YMEOC.YPIPE#.(PCTCXFR# / ^T*r 1 
+YALL0C#)+YME0C.YPIPE | LLbt ' 
.SNPDIS#.MMIO.(MWR 1 
.MCACHE)#.(YMSWEND# \ 
+ENMSWND)# \ 1 


YBGT.(SNPDIS+MMIO#+MWR 
.MCACHE)+YNOPIPE.SNPDIS# 
.MMIO.(MWR.MCACHE)#.YMSWEND 
+YALLOC.YMSWEND.ENMSWND 


/dneswmA 




I WMSWND J 




I ELSE 
YMEOC 1 


YPIPE.(SNPDIS+MMIO#+MWR 
| .MCACHE).YMEOC#+YPIPE 
| .SNPDIS#.MMIO.(MWR 

.MCACHE)#.YMSWEND 




.ENMSWND 


/twoswndx 




1 WMSWND ) 




Vsv 7 




YMEOC* 




YSWEHITMI = YMSWEND.MHITMI.ENMSWND.(YALLOC#+YNOPIPE#.YPIPE#)

240957-64

YWMSWND






PLD CODES 























Declaration Segment


TITLE    AYMBTRCK
PATTERN  A
REVISION 2.0
AUTHOR   ISIC SILAS
COMPANY  INTEL
DATE     2/4/91
CHIP     x01 85C22V10

; This PLD contains the YMBTRCK state machine.

Pin Declarations

PIN   1   MCLK       COMBINATORIAL ;
PIN   2   MRESET     COMBINATORIAL ;
PIN   3   /WMSWND    COMBINATORIAL ;
PIN   4   /MBOFFI    COMBINATORIAL ;
PIN   5   /PXSAS     COMBINATORIAL ;
PIN   6   /PSWBAS    COMBINATORIAL ;
PIN   7   MHOLD      COMBINATORIAL ;
PIN   8   /MNA       COMBINATORIAL ;
PIN   9   /WMNA      COMBINATORIAL ;
PIN  10   /YMLOCK    COMBINATORIAL ;
PIN  11   /YSWEHITM  COMBINATORIAL ;
PIN  12   GND
PIN  13   /PCTCXFR   COMBINATORIAL ;
PIN  14   /RSTRT     COMBINATORIAL ;
PIN  15   /YMEOC     COMBINATORIAL ;
PIN  16   UNUSED     registered ;
PIN  17   /YBGT      registered ;
PIN  18   /YMADS     registered ;
PIN  19   /MAOE      registered ;
PIN  20   /YNOPIPE   registered ;
PIN  21   /YMSTR     registered ;
PIN  22   /YPIPE     registered ;
PIN  23   /YMSEL     registered ;
PIN  24   VCC












Boolean Equation Segment 




EQUATIONS 






YNOPIPE := /MRESET * PXSAS * /MHOLD * YMEOC * YNOPIPE
        + /MRESET * PXSAS * YMLOCK * YMEOC * YNOPIPE
        + /MRESET * /PXSAS * /YMEOC * /YSWEHITM * YNOPIPE
        + /MRESET * /YMEOC * /WMSWND * /YSWEHITM * YNOPIPE
        + /MRESET * YMEOC * /YSWEHITM * /PCTCXFR * YPIPE
        + /MRESET * /PCTCXFR * RSTRT * YMSTR * /MAOE
        + /MRESET * /MNA * /WMNA * /YMEOC * /YSWEHITM * YNOPIPE
        + /MRESET * MHOLD * /YMLOCK * /YMEOC * /YSWEHITM * YNOPIPE
        + /MRESET * PXSAS * /MHOLD * /PCTCXFR * YMSTR * /MAOE
        + /MRESET * PXSAS * YMLOCK * /PCTCXFR * YMSTR * /MAOE
        + /MRESET * PXSAS * /MHOLD * /MBOFFI * /YMSTR * /MAOE
        + /MRESET * PXSAS * /MHOLD * /YSWEHITM * /PCTCXFR * /YNOPIPE * /YPIPE
          * YMSTR
        + /MRESET * PXSAS * YMLOCK * /YSWEHITM * /PCTCXFR * /YNOPIPE * /YPIPE
          * YMSTR






YPIPE   := /MRESET * /YMEOC * YPIPE
        + /MRESET * PXSAS * /MHOLD * MNA * /YMEOC * WMSWND * /YSWEHITM
          * YNOPIPE
        + /MRESET * PXSAS * /MHOLD * WMNA * /YMEOC * WMSWND * /YSWEHITM
          * YNOPIPE
        + /MRESET * PXSAS * MNA * YMLOCK * /YMEOC * WMSWND * /YSWEHITM
          * YNOPIPE
        + /MRESET * PXSAS * WMNA * YMLOCK * /YMEOC * WMSWND * /YSWEHITM
          * YNOPIPE
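
Sum-of-products equations of this form are easy to evaluate by hand or by program. The following Python sketch is an illustration only, not a PLD simulator: it assumes that each name stands for the logical (asserted) value of the signal, that `/` means complement, `*` means AND, and that the product terms of one equation are ORed. The helper names are hypothetical. The five product terms are those of the YPIPE equation in the listing.

```python
def eval_product(term, signals):
    """Evaluate one product term such as '/MRESET * /YMEOC * YPIPE'."""
    for literal in term.split("*"):
        name = literal.strip()
        if name.startswith("/"):
            if signals[name[1:]]:      # complemented literal must be false
                return False
        elif not signals[name]:        # plain literal must be true
            return False
    return True


def eval_sum_of_products(terms, signals):
    """OR together the product terms of one equation."""
    return any(eval_product(term, signals) for term in terms)


YPIPE_TERMS = [
    "/MRESET * /YMEOC * YPIPE",
    "/MRESET * PXSAS * /MHOLD * MNA * /YMEOC * WMSWND * /YSWEHITM * YNOPIPE",
    "/MRESET * PXSAS * /MHOLD * WMNA * /YMEOC * WMSWND * /YSWEHITM * YNOPIPE",
    "/MRESET * PXSAS * MNA * YMLOCK * /YMEOC * WMSWND * /YSWEHITM * YNOPIPE",
    "/MRESET * PXSAS * WMNA * YMLOCK * /YMEOC * WMSWND * /YSWEHITM * YNOPIPE",
]

SIGNAL_NAMES = ["MRESET", "PXSAS", "MHOLD", "MNA", "WMNA", "YMEOC",
                "WMSWND", "YSWEHITM", "YMLOCK", "YNOPIPE", "YPIPE"]


def next_ypipe(signals):
    """Value loaded into the YPIPE register at the next clock edge."""
    return eval_sum_of_products(YPIPE_TERMS, signals)


idle = dict.fromkeys(SIGNAL_NAMES, False)
holding = dict(idle, YPIPE=True)
print(next_ypipe(holding))                     # first term holds YPIPE
print(next_ypipe(dict(holding, MRESET=True)))  # every term contains /MRESET
```

The first term is the hold term: once YPIPE is set it stays set until YMEOC or MRESET is asserted. The remaining four terms are the entry conditions from YNOPIPE.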






YMSTR   := /MRESET * YPIPE










240957-65



        + /MRESET * /YMEOC * YNOPIPE
        + /MRESET * YSWEHITM * YNOPIPE
        + /MRESET * /MHOLD * YMSTR
        + /MRESET * YMLOCK * YMSTR
        + /MRESET * /MHOLD * /MBOFFI * /MAOE
        + /MRESET * PCTCXFR * YMSTR * /MAOE
        + /MRESET * RSTRT * YMSTR * /MAOE

MAOE    := /MRESET * /PCTCXFR * RSTRT * YMSTR * /MAOE
        + /MRESET * /YMEOC * YPIPE
        + /MRESET * YMLOCK * YMEOC * YNOPIPE
        + /MRESET * /YMEOC * /YSWEHITM * YNOPIPE
        + /MRESET * /YSWEHITM * /PCTCXFR * YPIPE
        + /MRESET * /YMEOC * /YMSTR * MAOE
        + /MRESET * YMLOCK * /YSWEHITM * /PCTCXFR * YMSTR
        + /MRESET * /MHOLD * /PCTCXFR * YMSTR * /MAOE
        + /MRESET * YMLOCK * /PCTCXFR * YMSTR * /MAOE
        + /MRESET * /MHOLD * /MBOFFI * /YMSTR * /MAOE
        + /MRESET * PSWBAS * MBOFFI * /YMSTR * /MAOE
        + /MRESET * /MHOLD * /YMLOCK * /YMEOC * /YNOPIPE * MAOE
        + /MRESET * /MHOLD * /YMLOCK * YMEOC * /YPIPE * YMSTR * MAOE
        + /MRESET * /PXSAS * YMLOCK * /YNOPIPE * /YPIPE * YMSTR * MAOE

YMADS   := /MRESET * PXSAS * /MHOLD * YMEOC * YNOPIPE
        + /MRESET * PXSAS * YMLOCK * YMEOC * YNOPIPE
        + /MRESET * /PCTCXFR * RSTRT * YMSTR * /MAOE
        + /MRESET * PXSAS * /MHOLD * /PCTCXFR * YMSTR * /MAOE
        + /MRESET * PXSAS * YMLOCK * /PCTCXFR * YMSTR * /MAOE
        + /MRESET * PXSAS * /MHOLD * /MBOFFI * /YMSTR * /MAOE
        + /MRESET * PXSAS * /MHOLD * /YSWEHITM * /PCTCXFR * /YNOPIPE * /YPIPE
          * YMSTR
        + /MRESET * PXSAS * YMLOCK * /YSWEHITM * /PCTCXFR * /YNOPIPE * /YPIPE
          * YMSTR
        + /MRESET * PSWBAS * MBOFFI * /YMSTR * /MAOE
        + /MRESET * PXSAS * /MHOLD * MNA * WMSWND * /YSWEHITM * YNOPIPE
        + /MRESET * PXSAS * /MHOLD * WMNA * WMSWND * /YSWEHITM * YNOPIPE
        + /MRESET * PXSAS * MNA * YMLOCK * WMSWND * /YSWEHITM * YNOPIPE
        + /MRESET * PXSAS * WMNA * YMLOCK * WMSWND * /YSWEHITM * YNOPIPE

YBGT    := /MRESET * PXSAS * /MHOLD * YMEOC * YNOPIPE
        + /MRESET * PXSAS * YMLOCK * YMEOC * YNOPIPE
        + /MRESET * PXSAS * /MHOLD * /MBOFFI * /YMSTR * /MAOE
        + /MRESET * PSWBAS * MBOFFI * /YMSTR * /MAOE
        + /MRESET * PXSAS * /MHOLD * MNA * WMSWND * /YSWEHITM * YNOPIPE
        + /MRESET * PXSAS * /MHOLD * WMNA * WMSWND * /YSWEHITM * YNOPIPE
        + /MRESET * PXSAS * MNA * YMLOCK * WMSWND * /YSWEHITM * YNOPIPE
        + /MRESET * PXSAS * WMNA * YMLOCK * WMSWND * /YSWEHITM * YNOPIPE
        + /MRESET * PXSAS * YMLOCK * /YNOPIPE * /YPIPE * YMSTR * MAOE
        + /MRESET * PXSAS * /MHOLD * /PCTCXFR * /RSTRT * YMSTR * /MAOE
        + /MRESET * PXSAS * YMLOCK * /PCTCXFR * /RSTRT * YMSTR * /MAOE
        + /MRESET * PXSAS * /MHOLD * /YSWEHITM * /PCTCXFR * /YNOPIPE * /YPIPE
          * YMSTR * MAOE



YMSEL := /MRESET * /YMEOC * YPIPE 

+ /MRESET * /YMEOC * /YSWEHITM * YNOPIPE 

+ /MRESET * /YSWEHITM * /PCTCXFR * YPIPE 

+ /MRESET * /YMEOC * WMSWND * PCTCXFR * /RSTRT * YMSTR * /MAOE 



UNUSED  := VCC



240957-66 
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Declare 




TITLE    AYMEMLEN
PATTERN  A
REVISION 2.0
AUTHOR   ISIC SILAS
COMPANY  INTEL
DATE     2/4/91
CHIP     x01 85C22V10

; This PLD contains the YMEMLEN and YCPUEOC state machines

Pin Declarations

PIN   1   MCLK       COMBINATORIAL ;
PIN   2   MRESET     COMBINATORIAL ;
PIN   3   /YMSEL     COMBINATORIAL ;
PIN   4   /YPIPE     COMBINATORIAL ;
PIN   5   /MABORT    COMBINATORIAL ;
PIN   6   /MBRDY     COMBINATORIAL ;
PIN   7   /WMSWND    COMBINATORIAL ;
PIN   8   XLRDYSRC   COMBINATORIAL ;
PIN   9   /XLKCACHE  COMBINATORIAL ;
PIN  10   /MKEN      COMBINATORIAL ;
PIN  11   LEN        COMBINATORIAL ;
PIN  12   GND ;
PIN  13   /CACHE     COMBINATORIAL ; INPUT
PIN  14   /YMEOC     COMBINATORIAL ; INPUT
PIN  15   /SVR0      COMBINATORIAL ; INPUT
PIN  16   /SVR1      COMBINATORIAL ; INPUT
PIN  17   /SVR2      COMBINATORIAL ; INPUT
PIN  18   /SVR3      COMBINATORIAL ; INPUT
PIN  19   /YCEOC     registered ;
PIN  20   /SVL0      registered ;
PIN  21   /SVL1      registered ;
PIN  22   /SVC0      registered ;
PIN  23   /SVC1      registered ;
PIN  24   VCC








Boolean Equation Segment 




EQUATIONS 




SVL1  := /YMEOC * /MRESET * SVL1
       + YPIPE * LEN * /XLKCACHE * /MRESET * SVL1
       + YMSEL * LEN * /MRESET * /SVL1 * /SVL0
       + YPIPE * LEN * XLRDYSRC * /MKEN * /MRESET * SVL1
       + YPIPE * YMEOC * LEN * /XLKCACHE * /MRESET * SVL0
       + YMSEL * /XLRDYSRC * XLKCACHE * /MRESET * /SVL1 * /SVL0
       + YMSEL * XLKCACHE * MKEN * /MRESET * /SVL1 * /SVL0
       + YPIPE * YMEOC * LEN * XLRDYSRC * /MKEN * /MRESET * SVL0

SVL0  := YMSEL * /XLRDYSRC * XLKCACHE * /MRESET * /SVL1 * /SVL0
       + YMSEL * XLKCACHE * MKEN * /MRESET * /SVL1 * /SVL0
       + /YMEOC * /MRESET * SVL0
       + YPIPE * /LEN * /XLKCACHE * /MRESET * SVL0
       + YMSEL * /LEN * /MRESET * /SVL1 * /SVL0
       + YPIPE * YMEOC * /LEN * /XLKCACHE * /MRESET * SVL1
       + YPIPE * /LEN * XLRDYSRC * /MKEN * /MRESET * SVL0
       + YPIPE * YMEOC * /LEN * XLRDYSRC * /MKEN * /MRESET * SVL1

SVC1  := /YMEOC * /YCEOC * /MRESET * SVC1
       + YMSEL * XLRDYSRC * /MRESET * /SVC1 * /SVC0
       + YPIPE * YMEOC * /CACHE * XLRDYSRC * /MRESET * SVC1
       + YPIPE * YMEOC * /CACHE * XLRDYSRC * /MRESET * SVC0

SVC0  := /YMEOC * /MRESET * SVC0
       + /YMEOC * YCEOC * /MRESET * SVC1
       + YPIPE * /XLRDYSRC * /MRESET * SVC0

240957-67


240957-67 
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iny. 
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PRBBJIBODHIAIRV 



       + YPIPE * YMEOC * /XLRDYSRC * /MRESET * SVC1
       + YMSEL * CACHE * /MRESET * /SVC1 * /SVC0
       + YMSEL * /XLRDYSRC * /MRESET * /SVC1 * /SVC0

YCEOC := SVR3 * /SVR2 * /SVR1 * SVL1 * SVL0 * SVC1 * WMSWND
         * /MABORT * /MRESET * /YCEOC
       + SVR3 * /SVR1 * /SVR0 * SVL1 * SVL0 * SVC1 * WMSWND
         * /MABORT * /MRESET * /YCEOC
       + /SVR3 * /SVR2 * SVR1 * /SVR0 * SVL1 * /SVL0 * SVC1
         * WMSWND * /MABORT * /MRESET * /YCEOC
       + /SVR3 * /SVR2 * /SVR1 * SVR0 * /SVL1 * SVL0 * SVC1
         * WMSWND * /MABORT * /MRESET * /YCEOC
       + /SVR3 * SVR2 * SVR1 * /SVR0 * SVL1 * SVL0 * SVC1
         * WMSWND * /MABORT * /MRESET * /YCEOC
       + /SVR3 * /SVR2 * SVR1 * SVR0 * SVL1 * SVL0 * SVC1
         * WMSWND * /MABORT * /MRESET * /YCEOC
       + /SVR3 * /SVR2 * SVR1 * /SVR0 * SVL1 * SVC1 * /SVC0
         * WMSWND * /MABORT * /MRESET * /YCEOC
       + /SVR3 * SVR2 * /SVR0 * SVL1 * SVL0 * SVC1 * /SVC0
         * WMSWND * /MABORT * /MRESET * /YCEOC
       + /SVR3 * /SVR2 * /SVR1 * /SVL1 * SVL0 * SVC1 * MBRDY
         * WMSWND * /MABORT * /MRESET * /YCEOC
       + /SVR3 * SVR2 * /SVR0 * SVL1 * SVL0 * SVC1 * MBRDY
         * WMSWND * /MABORT * /MRESET * /YCEOC
       + /SVR3 * /SVR2 * /SVR1 * SVR0 * SVL1 * /SVL0 * SVC1
         * MBRDY * WMSWND * /MABORT * /MRESET * /YCEOC
       + /SVR3 * /SVR2 * /SVR1 * SVR0 * SVL1 * SVC1 * /SVC0
         * MBRDY * WMSWND * /MABORT * /MRESET * /YCEOC

240957-68 
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Declaration Segment 












TITLE    BYRDYSTR
PATTERN  A
REVISION 2.0
AUTHOR   ISIC SILAS
COMPANY  INTEL
DATE     2/4/91
CHIP     x01 85C22V10

; This PLD contains the YRDYSTR, YRDYSTR, and YMEMEOC state machines.

Pin Declarations

PIN   1   MCLK       COMBINATORIAL ;
PIN   2   MRESET     COMBINATORIAL ;
PIN   3   TR4        COMBINATORIAL ;
PIN   4   /YALLOC    COMBINATORIAL ;
;PIN  5   ####
PIN   6   /MABORT    COMBINATORIAL ;
PIN   7   /MBRDY     COMBINATORIAL ;
PIN   8   /WMSWND    COMBINATORIAL ;
PIN   9   /MBOFFI    COMBINATORIAL ;
PIN  10   /YMSEL     COMBINATORIAL ;
PIN  11   /YPIPE     COMBINATORIAL ;
PIN  12   GND
PIN  13   /SVL1      COMBINATORIAL ; INPUT
PIN  14   /SVL0      COMBINATORIAL ; INPUT
PIN  15   /SVR3      registered ;
PIN  16   /SVR2      registered ;
PIN  17   /SVR1      registered ;
PIN  18   /SVR0      registered ;
PIN  19   /YMEOC1    registered ;
PIN  20   /YMEOC     registered ;
PIN  21   /YMFRZ     registered ;
PIN  22   /CTCEND    registered ;
PIN  23   /SV        registered ;
PIN  24   VCC ;

Boolean Equation Segment
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EQUATIONS 










SVR3  := /MRESET * /YMEOC * /MBRDY * /MABORT * SVR3
       + /MRESET * MBRDY * /MABORT * /MBOFFI * SVR2
       + /MRESET * /MBRDY * /MABORT * SVR3 * SVR2
       + /MRESET * MBRDY * /MABORT * SVR2 * SVR1
       + /MRESET * /YMEOC * /MABORT * SVR3 * SVR0
       + /MRESET * MBRDY * /MABORT * /TR4 * /SVR3 * SVR2

SVR2  := /MRESET * /MBRDY * /MABORT * SVR2
       + /MRESET * /MABORT * SVR2 * SVR1
       + /MRESET * /YMEOC * MBRDY * /MABORT * SVR1
       + /MRESET * MBRDY * /MABORT * MBOFFI * SVR1
       + /MRESET * MBRDY * /MABORT * SVR1 * SVR0

SVR1  := /MRESET * /MABORT * SVR1 * SVR0
       + /MRESET * /YMEOC * /MBRDY * /MABORT * SVR1
       + /MRESET * /MBRDY * /MABORT * MBOFFI * SVR1
       + /MRESET * /MBRDY * /MABORT * SVR2 * SVR1
       + /MRESET * /YMEOC * MBRDY * /MABORT * /SVR3 * SVR0
       + /MRESET * MBRDY * /MABORT * MBOFFI * /SVR3 * SVR0
       + /MRESET * /YMEOC * MBRDY * /MABORT * SVR3 * /SVR2 * /SVR0

SVR0  := /MRESET * MBRDY * /MABORT * /MBOFFI * SVR3
       + /MRESET * MBRDY * /MABORT * SVR3 * /SVR2
       + /MRESET * /YMEOC * /MBRDY * /MABORT * SVR0
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+ /MRESET * /MBRDY * /MABORT * SVRl * SVRO 




       + /MRESET * /MBRDY * /MABORT * MBOFFI * /SVR3 * SVR0
       + /MRESET * YPIPE * YMEOC * MBRDY * /MABORT * /SVR1 * SVR0
       + /MRESET * YMSEL * MBRDY * /MABORT * /SVR2 * /SVR1 * /SVR0
       + /MRESET * MBRDY * /MABORT * MBOFFI * /SVR2 * /SVR1 * /SVR0
       + /MRESET * YPIPE * YMEOC * MBRDY * /MABORT * /SVR2 * SVR1
         * /SVR0

CTCEND := /MRESET * MBRDY * /MABORT * MBOFFI * SVR3 * SVR2
        + /MRESET * MBRDY * /MABORT * MBOFFI * TR4 * SVR2 * /SVR1

YMEOC  = /MRESET * MABORT * YALLOC * /YMEOC * /SV
       + /MRESET * /SVR3 * /SVR2 * SVR1 * /SVR0 * SVL1 * /SVL0
         * WMSWND * /MABORT * /YMEOC * /SV
       + /MRESET * /SVR3 * /SVR2 * /SVR1 * SVR0 * /SVL1 * SVL0
         * WMSWND * /MABORT * /YMEOC * /SV
       + /MRESET * SVR3 * /SVR2 * /SVR1 * SVR0 * SVL1 * SVL0
         * WMSWND * /MABORT * /YMEOC * /SV
       + /MRESET * SVR3 * /SVR2 * /SVR1 * SVL1 * SVL0 * TR4 * WMSWND
         * /MABORT * /YMEOC * /SV
       + /MRESET * /SVR3 * /SVR2 * /SVR1 * /SVL1 * SVL0 * MBRDY
         * WMSWND * /MABORT * /YMEOC * /SV
       + /MRESET * /SVR3 * /SVR2 * /SVR1 * SVR0 * SVL1 * /SVL0
         * MBRDY * WMSWND * /MABORT * /YMEOC * /SV
       + /MRESET * SVR3 * SVR2 * /SVR1 * /SVR0 * SVL1 * SVL0
         * MBRDY * WMSWND * /MABORT * /YMEOC * /SV
       + /MRESET * SVR2 * /SVR1 * /SVR0 * SVL1 * SVL0 * TR4 * MBRDY
         * WMSWND * /MABORT * /YMEOC * /SV

SV     = /MRESET * YMEOC

/YMFRZ = MRESET
       + /MABORT
       + /YALLOC
       + YMEOC
       + SV
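
The /YMFRZ equation is written for the complemented output. Applying De Morgan's theorem gives the equivalent true-sense form YMFRZ = /MRESET * MABORT * YALLOC * /YMEOC * /SV, which is easier to compare with the diagram note "{/YMFRZ IF YALLOC.MABORT}". The Python sketch below checks the equivalence exhaustively; it treats each name as a plain Boolean, and the function names are hypothetical.

```python
from itertools import product


def ymfrz_complement(mreset, mabort, yalloc, ymeoc, sv):
    """Right-hand side of '/YMFRZ = MRESET + /MABORT + /YALLOC + YMEOC + SV'."""
    return mreset or (not mabort) or (not yalloc) or ymeoc or sv


def ymfrz(mreset, mabort, yalloc, ymeoc, sv):
    """True-sense form obtained with De Morgan's theorem."""
    return (not mreset) and mabort and yalloc and (not ymeoc) and (not sv)


def forms_agree():
    """Compare both forms over all 32 input combinations."""
    return all(
        ymfrz(*bits) == (not ymfrz_complement(*bits))
        for bits in product([False, True], repeat=5)
    )


print(forms_agree())
```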




YMEOC1 = /MRESET * MABORT * YALLOC * /YMEOC1 * /SV
       + /MRESET * /SVR3 * /SVR2 * SVR1 * /SVR0 * SVL1 * /SVL0
         * WMSWND * /MABORT * /YMEOC1 * /SV
       + /MRESET * /SVR3 * /SVR2 * /SVR1 * SVR0 * /SVL1 * SVL0
         * WMSWND * /MABORT * /YMEOC1 * /SV
       + /MRESET * SVR3 * /SVR2 * /SVR1 * SVR0 * SVL1 * SVL0
         * WMSWND * /MABORT * /YMEOC1 * /SV
       + /MRESET * SVR3 * /SVR2 * /SVR1 * SVL1 * SVL0 * TR4 * WMSWND
         * /MABORT * /YMEOC1 * /SV
       + /MRESET * /SVR3 * /SVR2 * /SVR1 * /SVL1 * SVL0 * MBRDY
         * WMSWND * /MABORT * /YMEOC1 * /SV
       + /MRESET * /SVR3 * /SVR2 * /SVR1 * SVR0 * SVL1 * /SVL0
         * MBRDY * WMSWND * /MABORT * /YMEOC1 * /SV
       + /MRESET * SVR3 * SVR2 * /SVR1 * /SVR0 * SVL1 * SVL0
         * MBRDY * WMSWND * /MABORT * /YMEOC1 * /SV
       + /MRESET * SVR2 * /SVR1 * /SVR0 * SVL1 * SVL0 * TR4 * MBRDY
         * WMSWND * /MABORT * /YMEOC1 * /SV
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TITLE 


EABORT 






PATTERN  A
REVISION 2.
AUTHOR   ISIC SILAS
COMPANY  INTEL
DATE     2/5/91
CHIP     x01 85C224

; This PLD contains the YABORT, YRSTRT, and YMEMDOE state machines.

Pin Declarations

PIN   1   MCLK
PIN   2   MRESET
PIN   3   WMSWND
PIN   4   YSWEHITM
PIN   5   YALLOC
PIN   6   YPIPE
PIN   7   YNOPIPE
PIN   8   YMEOC
PIN   9   MHITMI
PIN  10   MAOE
PIN  11   CTCEND
PIN  13   MKEN
PIN  14   MHLDA
PIN  15   YWR
PIN  16   YMADS
PIN  23   CTCDIS
PIN  18   RSTRT
PIN  19   YMDOE
PIN  20   PCTCXFR
PIN  21   TRIABORT
PIN  22   MABORT
PIN  17   SV         ; Swapped pins 23 and 17 to fit 85C224




EQUATIONS 








/RSTRT.D     := /MRESET * /PCTCXFR * /CTCDIS
              + /MRESET * /PCTCXFR * /RSTRT
              + /MRESET * /YSWEHITM * /CTCDIS * RSTRT
              + /MRESET * /YSWEHITM * YWR * YALLOC * RSTRT
RSTRT.CLKF    = MCLK
RSTRT.RSTF    = GND
RSTRT.SETF    = GND
RSTRT.TRST    = VCC

/YMDOE.D     := /MRESET * YWR * /YPIPE * /YMEOC
              + /MRESET * /YNOPIPE * YMEOC * /YMDOE
              + /MRESET * MHLDA * YMEOC * /YMDOE
              + /MRESET * /YPIPE * YMEOC * /YMDOE
              + /MRESET * /YMADS * YWR * /YNOPIPE * YMDOE
              + /MRESET * /YMADS * YWR * MHLDA * YMDOE
YMDOE.CLKF    = MCLK
YMDOE.RSTF    = GND
YMDOE.SETF    = GND
YMDOE.TRST    = VCC

/PCTCXFR.D   := /MRESET * YALLOC * /MABORT
              + /MRESET * /YSWEHITM * /MAOE * PCTCXFR
              + /MRESET * /MHITMI * /WMSWND * /MABORT
              + /MRESET * CTCEND * /PCTCXFR * MABORT
              + /MRESET * /PCTCXFR * MABORT * /SV
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+ /MRESET ■* /YSWEHITM * /YPIPE * YMEOC * PCTCXFR 




              + /MRESET * /YPIPE * /YMEOC * /YALLOC * PCTCXFR
              + /MRESET * /YNOPIPE * YMEOC * /YALLOC * /MKEN * PCTCXFR
PCTCXFR.CLKF  = MCLK
PCTCXFR.RSTF  = GND
PCTCXFR.SETF  = GND
PCTCXFR.TRST  = VCC

/TRIABORT.D  := /MRESET * /YPIPE * /YMEOC * /YALLOC * PCTCXFR * /MHLDA
              + /MRESET * /YNOPIPE * YMEOC * /YALLOC * /MKEN * PCTCXFR
                * /MHLDA
              + /MRESET * /YSWEHITM * /MAOE * YPIPE * PCTCXFR * /MHLDA
              + /MRESET * /YSWEHITM * /MAOE * /YMEOC * PCTCXFR * /MHLDA
              + /MRESET * /YMEOC * /PCTCXFR * MABORT * /SV * /MHLDA
TRIABORT.CLKF = MCLK
TRIABORT.RSTF = GND
TRIABORT.SETF = GND
TRIABORT.TRST = /MHLDA

/MABORT.D    := /MRESET * /YPIPE * /YMEOC * /YALLOC * PCTCXFR
              + /MRESET * /YNOPIPE * YMEOC * /YALLOC * /MKEN * PCTCXFR
              + /MRESET * /YSWEHITM * /MAOE * YPIPE * PCTCXFR
              + /MRESET * /YSWEHITM * /MAOE * /YMEOC * PCTCXFR
              + /MRESET * /YMEOC * /PCTCXFR * MABORT * /SV
MABORT.CLKF   = MCLK
MABORT.RSTF   = GND
MABORT.SETF   = GND
MABORT.TRST   = VCC

/SV.D        := MABORT * /SV
              + PCTCXFR
SV.CLKF       = MCLK
SV.RSTF       = GND
SV.SETF       = GND
SV.TRST       = VCC




240957-72 






TITLE EASTB 
PATTERN A 
REVISION 2.0 
AUTHOR ISIC SILAS 
COMPANY INTEL 
DATE 2/4/91 



Declaration Segment 



CHIP     x01 85C224

; This PLD contains the XASTB, XSTFAIL, and XDTSTRCK state machines

Pin Declarations

PIN   1   CLK
PIN   2   RESET
PIN   3   CADS
PIN   4   CDTS
PIN   5   SNPADS
PIN   6   CWR
PIN   7   YSBGT
PIN   8   CRDY
PIN   9   CAHOLD
PIN  10   FSIOUT
PIN  11   SLFTST
PIN  13   OEx
PIN  14   RDYSRC
PIN  16   LRDYSRC
PIN  17   SV2
PIN  18   STFAIL
PIN  19   SV1
PIN  20   XSNPWB
PIN  21   WSDTS
PIN  22   XAS

; OE control inverted during design conversion.
EQUATIONS

LRDYSRC.D     := RDYSRC
LRDYSRC.CLKF   = CLK
LRDYSRC.RSTF   = GND
LRDYSRC.SETF   = GND
/LRDYSRC.TRST  = OEx

/SV2.D        := /RESET * FSIOUT * /SLFTST * STFAIL * SV2
SV2.CLKF       = CLK
SV2.RSTF       = GND
SV2.SETF       = GND
/SV2.TRST      = OEx

/STFAIL.D     := /RESET * FSIOUT * /CAHOLD * /SLFTST * STFAIL * SV2
STFAIL.CLKF    = CLK
STFAIL.RSTF    = GND
STFAIL.SETF    = GND
/STFAIL.TRST   = OEx

/SV1.D        := /RESET * CDTS * CRDY * /SV1
               + /RESET * CDTS * WSDTS * /SV1
               + /RESET * /SNPADS * XSNPWB * SV1
               + /RESET * /CDTS * CRDY * /WSDTS * XSNPWB
SV1.CLKF       = CLK
SV1.RSTF       = GND
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SV1.SETF = GND 
/SV1.TRST      = OEx

/XSNPWB.D     := /RESET * CRDY * /XSNPWB
               + /RESET * /CDTS * WSDTS * /SV1
XSNPWB.CLKF    = CLK
XSNPWB.RSTF    = GND
XSNPWB.SETF    = GND
/XSNPWB.TRST   = OEx

/WSDTS.D      := /RESET * CRDY * /XSNPWB
               + /RESET * /CDTS * XSNPWB
               + /RESET * /WSDTS * /SV1
               + /RESET * SNPADS * CRDY * /WS
WSDTS.CLKF     = CLK
WSDTS.RSTF     = GND
WSDTS.SETF     = GND
/WSDTS.TRST    = OEx

/XAS.D        := /RESET * SNPADS * YSBGT * /XAS
               + /RESET * /CDTS * CWR * XAS
               + /RESET * /CADS * /CWR * XAS
XAS.CLKF       = CLK
XAS.RSTF       = GND
XAS.SETF       = GND
/XAS.TRST      = OEx








TITLE    EBGTKWN
PATTERN
REVISION 1.0
COMPANY  INTEL
DATE

Declaration Segment

CHIP     INTEL 85C224

; This PLD contains the XBGTKW

Pin Declarations

PIN   1   CLK
PIN   2   RESET
PIN   3   YSBGT
PIN   4   CRDY
PIN   5   C8LDRV
PIN   6   TR4
PIN   7   NC5
PIN   8   NC6
PIN   9   WCPLB
PIN  10   CNADIS
PIN  11   NC1
PIN  13   OE
PIN  14   NC2
PIN  15   NC3
PIN  23   NC4
PIN  16   CKENLC
PIN  17   ENBGT
PIN  18   CNA
PIN  19   PBGT
PIN  20   KWEND
PIN  21   C5BGT
PIN  22   BGT

EQUATIONS

/CKENLC.D    := /RESET * YSBGT * CRDY * /BGT * /KWEND
              + /RESET * CRDY * ENBGT * /BGT * /KWEND
CKENLC.CLKF   = CLK
CKENLC.RSTF   = GND
CKENLC.SETF   = GND
/CKENLC.TRST  = OE

ENBGT.D      := /RESET * /YSBGT
ENBGT.CLKF    = CLK
ENBGT.RSTF    = GND
ENBGT.SETF    = GND
/ENBGT.TRST   = OE

/CNA.D       := /RESET * CRDY * /CNA
              + /RESET * /YSBGT * WCPLB * CNADIS
              + /RESET * /YSBGT * WCPLB * /CNA
              + /RESET * /PBGT * WCPLB * /CNA
              + /RESET * /BGT * WCPLB * CNADIS * CNA
CNA.CLKF      = CLK
CNA.RSTF      = GND
CNA.SETF      = GND
/CNA.TRST     = OE

/PBGT.D      := /RESET * CRDY * /PBGT
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+ /RESET * /YSBGT * CRDY * 


                /ENBGT * /BGT * /KWEND
PBGT.CLKF     = CLK
PBGT.RSTF     = GND
PBGT.SETF     = GND
/PBGT.TRST    = OE

/KWEND.D     := /RESET * /BGT * KWEND
              + /RESET * /CRDY * /PBGT
              + /RESET * YSBGT * CRDY * /BGT
              + /RESET * CRDY * ENBGT * /BGT
              + RESET * TR4
KWEND.CLKF    = CLK
KWEND.RSTF    = GND
KWEND.SETF    = GND
/KWEND.TRST   = OE

/C5BGT.D     := /RESET * /BGT * KWEND
              + /RESET * /CRDY * /PBGT
              + /RESET * YSBGT * CRDY * /BGT
              + /RESET * CRDY * ENBGT * /BGT
              + /RESET * /YSBGT * /CRDY * /ENBGT * /BGT
              + /RESET * /YSBGT * /ENBGT * C5BGT * KWEND * PBGT
              + RESET * /C8LDRV
C5BGT.CLKF    = CLK
C5BGT.RSTF    = GND
C5BGT.SETF    = GND
/C5BGT.TRST   = OE

/BGT.D       := /RESET * /BGT * KWEND
              + /RESET * /CRDY * /PBGT
              + /RESET * YSBGT * CRDY * /BGT
              + /RESET * CRDY * ENBGT * /BGT
              + /RESET * /YSBGT * /CRDY * /ENBGT * /BGT
              + /RESET * /YSBGT * /ENBGT * C5BGT * KWEND * PBGT
BGT.CLKF      = CLK
BGT.RSTF      = GND
BGT.SETF      = GND
/BGT.TRST     = OE
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TITLE EBRDY 
PATTERN  A
REVISION 2.0
AUTHOR   ISIC SILAS
COMPANY  INTEL
DATE     2/4/91

Declaration Segment

CHIP     x01 85C224

; This PLD contains the XBRDY, XMSWNDO, and XCTRCK state machines.

Pin Declarations

PIN   1   CLK
PIN   2   RESET
PIN   3   CLEN1
PIN   4   WSDTS
PIN   5   YSCEOC
PIN   6   SNPCYC
PIN   7   MSNPSTB
PIN   8   CKENLC
PIN   9   MKEN
PIN  13   OEx
PIN  15   CKEN
PIN  17   SV
PIN  18   PNDCEOC
PIN  19   ENBRDY
PIN  20   BRDY
PIN  21   BRDY1
PIN  22   MSWNDO

EQUATIONS

/CKEN         = CKENLC * /MKEN
              + /CKENLC * /CKEN
              + /MKEN * /CKEN
CKEN.TRST     = VCC

/SV.D        := /RESET * BRDY * /SV
              + /RESET * CLEN1 * /SV
              + /RESET * /YSCEOC * BRDY * ENBRDY * /PNDCEOC
              + /RESET * /YSCEOC * CLEN1 * ENBRDY * /PNDCEOC
SV.CLKF       = CLK
SV.RSTF       = GND
SV.SETF       = GND
/SV.TRST      = OEx















/PNDCEOC.D   := /RESET * /SV
              + /RESET * BRDY * /PNDCEOC
              + /RESET * CLEN1 * /PNDCEOC
              + /RESET * /YSCEOC * ENBRDY * /PNDCEOC
              + /RESET * /YSCEOC * /ENBRDY * PNDCEOC
PNDCEOC.CLKF  = CLK
PNDCEOC.RSTF  = GND
PNDCEOC.SETF  = GND
/PNDCEOC.TRST = OEx

/ENBRDY.D    := YSCEOC * PNDCEOC
              + /ENBRDY * PNDCEOC
              + YSCEOC * /BRDY * /CLEN1
              + /YSCEOC * BRDY * /ENBRDY
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+ /YSCEOC * CLENl * /ENBRDY 

              + /BRDY * /CLEN1 * ENBRDY * /PNDCEOC
              + RESET
ENBRDY.CLKF   = CLK
ENBRDY.RSTF   = GND
ENBRDY.SETF   = GND
/ENBRDY.TRST  = OEx

/BRDY.D      := /RESET * CLEN1 * /BRDY
              + /RESET * /PNDCEOC * /WSDTS * BRDY
              + /RESET * /YSCEOC * /ENBRDY * /WSDTS * BRDY
BRDY.CLKF     = CLK
BRDY.RSTF     = GND
BRDY.SETF     = GND
/BRDY.TRST    = OEx

/BRDY1.D     := /RESET * CLEN1 * /BRDY1
              + /RESET * /PNDCEOC * /WSDTS * BRDY1
              + /RESET * /YSCEOC * /ENBRDY * /WSDTS * BRDY1
BRDY1.CLKF    = CLK
BRDY1.RSTF    = GND
BRDY1.SETF    = GND
/BRDY1.TRST   = OEx

/MSWNDO       = /RESET * /SNPCYC
              + /RESET * MSNPSTB * /MSWNDO
MSWNDO.TRST   = VCC
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TITLE ECYCDEF 
PATTERN A 
REVISION 2.0 
AUTHOR ISIC SILAS 
COMPANY INTEL 
DATE 2/5/91 



Declaration Segment 



CHIP     x01 85C224

; This PLD contains the YMEM

Pin Declarations

PIN   1   YMALE
PIN   2   YMAOE
PIN   3   MHLDA
PIN   4   KCACHE
PIN   5   CWR
PIN   6   CMIO
PIN   7   CDC
PIN   8   LLEN
PIN   9   C5MTHIT
PIN  10   YSNPDIS
PIN  11   NC1
PIN  13   NC2
PIN  14   YNOSWNDI
PIN  23   YWRI
PIN  15   YWR
PIN  16   MTHIT
PIN  17   MLEN
PIN  18   MDC
PIN  19   MMIO
PIN  20   MWR
PIN  21   MCACHE
PIN  22   YNOSWND

EQUATIONS

/YWR          = YMALE * /CWR
              + /YMALE * /YWRI
              + /CWR * /YWRI
YWR.TRST      = VCC

/MTHIT        = /C5MTHIT * /MWR * MHLDA
MTHIT.TRST    = MHLDA

/MLEN         = YMALE * /LLEN * /YMAOE
              + /YMALE * /MLEN * /YMAOE
              + /LLEN * /MLEN * /YMAOE
MLEN.TRST     = /YMAOE

/MDC          = YMALE * /CDC * /YMAOE
              + /YMALE * /MDC * /YMAOE
              + /CDC * /MDC * /YMAOE
MDC.TRST      = /YMAOE

/MMIO         = YMALE * /CMIO * /YMAOE
              + /YMALE * /MMIO * /YMAOE
              + /CMIO * /MMIO * /YMAOE
MMIO.TRST     = /YMAOE

/MWR          = YMALE * /CWR * /YMAOE
              + /YMALE * /MWR * /YMAOE
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+ /CWR * /MWR * /YMAOE 
MWR.TRST      = /YMAOE

/MCACHE       = YMALE * /KCACHE * /YMAOE
              + /YMALE * /MCACHE * /YMAOE
              + /KCACHE * /MCACHE * /YMAOE
MCACHE.TRST   = /YMAOE
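
Each of the three-term equations above has the shape of a transparent latch built from combinatorial feedback: one term passes the input while the gate (YMALE) is active, one term holds the fed-back output while the gate is inactive, and the third term contains neither sense of the gate. The Python sketch below illustrates this reading. It is an interpretation rather than something stated in the listing: `gate`, `data` and `held` are hypothetical names standing for YMALE, the complemented input (for example /KCACHE) and the complemented fed-back output (for example /MCACHE), and the common /YMAOE factor is left out.

```python
from itertools import product


def latch_with_cover(gate, data, held):
    """Three-term form used in the listing (without the /YMAOE factor)."""
    return (gate and data) or ((not gate) and held) or (data and held)


def latch_minimal(gate, data, held):
    """Same latch without the third (consensus) term."""
    return (gate and data) or ((not gate) and held)


def cover_term_is_redundant():
    """The third term never changes the steady-state logic value."""
    return all(
        latch_with_cover(*bits) == latch_minimal(*bits)
        for bits in product([False, True], repeat=3)
    )


print(cover_term_is_redundant())
```

Because the third term is logically redundant (it is the consensus of the other two), a likely reason for including it is to keep the output stable while the gate changes state. That is the usual motive for such a term in a latch built from gates, but the listing itself does not say so.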

/YNOSWND      = YMALE * /YSNPDIS
              + YMALE * /CMIO
              + YMALE * CWR * /KCACHE
              + /YMALE * /YNOSWNDI
              + /YSNPDIS * /YNOSWNDI
              + /CMIO * /YNOSWNDI
              + CWR * /KCACHE * /YNOSWNDI
YNOSWND.TRST  = VCC
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TITLE 


EMBE
PATTERN  A
REVISION 2.0
AUTHOR   ISIC SILAS
COMPANY  INTEL
DATE     2/5/91

Declaration Segment

CHIP     x01 85C224

; This PLD generates the memory bus byte enables (MBEs)

Pin Declarations

PIN   1   LBE0
PIN   2   LBE1
PIN   3   LBE2
PIN   4   LBE3
PIN   5   LBE4
PIN   6   LBE5
PIN   7   LBE6
PIN   8   LBE7
PIN   9   RDYSRC
PIN  10   KCACHE
PIN  11   YMALE
PIN  13   YMAOE
PIN  14   MBE6I
PIN  23   MBE7I
PIN  15   MBE7
PIN  16   MBE5
PIN  17   MBE4
PIN  18   MBE3
PIN  19   MBE2
PIN  20   MBE1
PIN  21   MBE0
PIN  22   MBE6

EQUATIONS

/MBE7      = /YMALE * /LBE7 * /YMAOE
           + /YMALE * /KCACHE * /RDYSRC * /YMAOE
           + YMALE * /MBE7I * /YMAOE
           + /LBE7 * /MBE7I * /YMAOE
           + /KCACHE * /RDYSRC * /MBE7I * /YMAOE
MBE7.TRST  = /YMAOE

/MBE5      = /YMALE * /LBE5 * /YMAOE
           + /YMALE * /KCACHE * /RDYSRC * /YMAOE
           + YMALE * /MBE5 * /YMAOE
           + /LBE5 * /MBE5 * /YMAOE
           + /KCACHE * /RDYSRC * /MBE5 * /YMAOE
MBE5.TRST  = /YMAOE

/MBE4      = /YMALE * /LBE4 * /YMAOE
           + /YMALE * /KCACHE * /RDYSRC * /YMAOE
           + YMALE * /MBE4 * /YMAOE
           + /LBE4 * /MBE4 * /YMAOE
           + /KCACHE * /RDYSRC * /MBE4 * /YMAOE
MBE4.TRST  = /YMAOE

/MBE3      = /YMALE * /LBE3 * /YMAOE
           + /YMALE * /KCACHE * /RDYSRC * /YMAOE
           + YMALE * /MBE3 * /YMAOE
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+ /LBE3 * /MBE3 * /YMAOE 
           + /KCACHE * /RDYSRC * /MBE3 * /YMAOE
MBE3.TRST  = /YMAOE

/MBE2      = /YMALE * /LBE2 * /YMAOE
           + /YMALE * /KCACHE * /RDYSRC * /YMAOE
           + YMALE * /MBE2 * /YMAOE
           + /LBE2 * /MBE2 * /YMAOE
           + /KCACHE * /RDYSRC * /MBE2 * /YMAOE
MBE2.TRST  = /YMAOE

/MBE1      = /YMALE * /LBE1 * /YMAOE
           + /YMALE * /KCACHE * /RDYSRC * /YMAOE
           + YMALE * /MBE1 * /YMAOE
           + /LBE1 * /MBE1 * /YMAOE
           + /KCACHE * /RDYSRC * /MBE1 * /YMAOE
MBE1.TRST  = /YMAOE

/MBE0      = /YMALE * /LBE0 * /YMAOE
           + /YMALE * /KCACHE * /RDYSRC * /YMAOE
           + YMALE * /MBE0 * /YMAOE
           + /LBE0 * /MBE0 * /YMAOE
           + /KCACHE * /RDYSRC * /MBE0 * /YMAOE
MBE0.TRST  = /YMAOE

/MBE6      = /YMALE * /LBE6 * /YMAOE
           + /YMALE * /KCACHE * /RDYSRC * /YMAOE
           + YMALE * /MBE6I * /YMAOE
           + /LBE6 * /MBE6I * /YMAOE
           + /KCACHE * /RDYSRC * /MBE6I * /YMAOE
MBE6.TRST = /YMAOE 
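
All eight byte-enable equations share one template, so a single function can model any of them. The Python sketch below is a reading of the equations, not code from the application note. It assumes that every argument is the electrical level of the named pin (True for high), that an equation for /MBEn drives the pin low when its right-hand side is true, and that the pin is tri-stated whenever YMAOE is high. The function and parameter names are hypothetical.

```python
def mbe_pin_level(lbe, mbe_feedback, ymale, kcache, rdysrc, ymaoe):
    """Level driven on one MBEn pin, or None when the output is tri-stated.

    mbe_feedback is the fed-back level of the same output
    (MBE7I or MBE6I for MBE7 and MBE6, the pin itself otherwise).
    """
    if ymaoe:
        # MBEn.TRST = /YMAOE: the driver is disabled while YMAOE is high.
        return None
    drive_low = (
        (not ymale and not lbe)                      # pass LBEn through
        or (not ymale and not kcache and not rdysrc)  # force all enables low
        or (ymale and not mbe_feedback)              # hold previous value
        or (not lbe and not mbe_feedback)            # cover term
        or (not kcache and not rdysrc and not mbe_feedback)  # cover term
    )
    return not drive_low


# With YMALE low and both KCACHE and RDYSRC low, the enable is driven low
# even though LBEn is high.
print(mbe_pin_level(lbe=True, mbe_feedback=True, ymale=False,
                    kcache=False, rdysrc=False, ymaoe=False))
```

Under this reading the second product term overrides the individual LBEn inputs, so all eight enables go low together when KCACHE and RDYSRC are both low.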
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TITLE EMEMALE 
PATTERN A 
REVISION 2.0 
AUTHOR ISIC SILAS 
COMPANY INTEL 
DATE 2/4/91 



Declaration Segment 



CHIP     x01 85C224

; This PLD contains the YMALE, YMBRDY, YWMNA,
; and YIMSWND state machines.

Pin Declarations

PIN   1   MCLK
PIN   2   MRESET
PIN   3   YBGT
PIN   4   YNOPIPE
PIN   5   YPIPE
PIN   6   MNA
PIN   7   WMSWND
PIN   8   YMEOC
PIN   9   PXSAS
PIN  10   YALLOC
PIN  11   MSWNDI
PIN  13   OE
PIN  14   MBRDY
PIN  23   YMADS
PIN  15   YIMSWND
PIN  16   YDRCTM
PIN  17   NC1
PIN  18   YMALE
PIN  19   DISWND
PIN  20   WMNA
PIN  21   YMBRDY
PIN  22   NC2

EQUATIONS

/YIMSWND      = /MSWNDI * YALLOC * DISWND
YIMSWND.TRST  = VCC

/YDRCTM.D    := /MRESET * /DISWND
              + /MRESET * YMEOC * YPIPE * /YDRCTM
YDRCTM.CLKF   = MCLK
YDRCTM.RSTF   = GND
YDRCTM.SETF   = GND
/YDRCTM.TRST  = OE

NC1.D        := VCC
NC1.CLKF      = MCLK
NC1.RSTF      = GND
NC1.SETF      = GND
/NC1.TRST     = OE

/YMALE.D     := /MRESET * /YPIPE * /YMALE
              + /MRESET * /YBGT * YMALE
              + /MRESET * YNOPIPE * YMEOC * /YMALE
              + /MRESET * WMSWND * YMEOC * /YMALE
              + /MRESET * /YMADS * WMNA * YMEOC * /YMALE
              + /MRESET * MNA * WMNA * YMEOC * /YMALE
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YMALE.CLKF - MCLK 






YMALE.RSTF    = GND
YMALE.SETF    = GND
/YMALE.TRST   = OE

/DISWND.D    := /MRESET * /DISWND * YDRCTM
              + /MRESET * /PXSAS * /YALLOC * YDRCTM
DISWND.CLKF   = MCLK
DISWND.RSTF   = GND
DISWND.SETF   = GND
/DISWND.TRST  = OE

/WMNA.D      := /MRESET * /YNOPIPE * YMADS * YMEOC * /WMNA
              + /MRESET * /YNOPIPE * YMADS * /MNA * WMNA
              + /MRESET * /YPIPE * YMADS * /MNA * /YMEOC * WMNA
WMNA.CLKF     = MCLK
WMNA.RSTF     = GND
WMNA.SETF     = GND
/WMNA.TRST    = OE

/YMBRDY.D    := /MRESET * /MBRDY
YMBRDY.CLKF   = MCLK
YMBRDY.RSTF   = GND
YMBRDY.SETF   = GND
/YMBRDY.TRST  = OE






NC2 - VCC 






NC2.TRST - VCC 
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TITLE EMSNPST 


PATTERN A
REVISION 2.0
AUTHOR ISIC SILAS
COMPANY INTEL
DATE 2/4/91


CHIP xOl 85C224 



Declaration Segment 



This PLD contains the YALLC, YMEMLOCK, YSNPSTB,
and YMBREQ state machines.

Pin Declarations



PIN 1 MCLK
PIN 2 MRESET
PIN 3 MKEN
PIN 4 MHLDA
PIN 5 YWR
PIN 6 YNOSWND
PIN 7 YBGT
PIN 8 XLRDYSRC
PIN 9 RFO
PIN 10 SNPDIS
PIN 11 PALLC
PIN 13 KLOCK
PIN 23 PXSAS
PIN 16 YMLOCK
PIN 17 SV2
PIN 18 HBASWB
PIN 19 MBREQ
PIN 20 MSNPSTB
PIN 21 SV1
PIN 22 YALLOC


EQUATIONS 




/YMLOCK.D := /MRESET * /YALLOC * YMLOCK
          + /MRESET * /HBASWB * YMLOCK
          + /MRESET * YBGT * /YALLOC * /HBASWB
          + /MRESET * /KLOCK * /YALLOC * /HBASWB
          + /MRESET * /YBGT * /KLOCK * YMLOCK
YMLOCK.CLKF = MCLK
YMLOCK.RSTF = GND
YMLOCK.SETF = GND
YMLOCK.TRST = VCC

/SV2.D := PXSAS * /SV2
       + PXSAS * /MHLDA * /HBASWB
SV2.CLKF = MCLK
SV2.RSTF = GND
SV2.SETF = GND
SV2.TRST = VCC

/HBASWB.D := /MRESET * PXSAS * /HBASWB * SV2
          + /MRESET * PXSAS * /YBGT * MBREQ * SV2
HBASWB.CLKF = MCLK
HBASWB.RSTF = GND
HBASWB.SETF = GND
HBASWB.TRST = VCC

/MBREQ.D := PXSAS * /MBREQ
         + PXSAS * HBASWB * /SV2
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+ /PXSAS * /YBGT * MBREQ * HBASWB * SV2 




         + MRESET
MBREQ.CLKF = MCLK
MBREQ.RSTF = GND
MBREQ.SETF = GND
MBREQ.TRST = VCC

/MSNPSTB.D := /MRESET * /YBGT * YNOSWND * YWR * MSNPSTB
           + /MRESET * /YBGT * YNOSWND * XLRDYSRC * MSNPSTB
           + /MRESET * /YBGT * YNOSWND * RFO * MSNPSTB
MSNPSTB.CLKF = MCLK
MSNPSTB.RSTF = GND
MSNPSTB.SETF = GND
MSNPSTB.TRST = VCC

/SV1.D := /YALLOC
SV1.CLKF = MCLK
SV1.RSTF = GND
SV1.SETF = GND
SV1.TRST = VCC

/YALLOC.D := /MRESET * PXSAS * /YALLOC * /SV1
          + /MRESET * /MKEN * /YALLOC * SV1
          + /MRESET * /YBGT * /PALLC * SNPDIS * /RFO * YALLOC
YALLOC.CLKF = MCLK
YALLOC.RSTF = GND
YALLOC.SETF = GND
YALLOC.TRST = VCC
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TITLE EMZBT 
PATTERN A 
REVISION 3.1 
AUTHOR ISIC SILAS + Andy Bloom
COMPANY INTEL
DATE 2/7/91

Declaration Segment



CHIP xOl 85C224 



This PLD contains the YMBRDY state machine. 



Pin Declarations 



PIN 1 MCLK
PIN 2 MRESET
PIN 3 MAOE
PIN 4 MHLDA
PIN 5 YNOPIPE
PIN 6 YPIPE
PIN 7 MCACHE
PIN 8 YMEOC
PIN 9 MEMZBTEN
PIN 10 SYNC
PIN 11 MALDRV
PIN 13 FLUSH
PIN 14 NCPFLD
PIN 15 FPFLDEN
PIN 23 NC4
PIN 16 NC1
PIN 17 NC2
PIN 18 NC3
PIN 19 YMZBT
PIN 20 FPFLD
PIN 21 YFLUSH
PIN 22 YSYNC


EQUATIONS 




NC1 = VCC
NC1.TRST = VCC
NC2 = VCC
NC2.TRST = VCC
NC3 = VCC
NC3.TRST = VCC

/YMZBT.D := /MRESET * YPIPE * YMEOC * /YMZBT
         + /MRESET * /MCACHE * YMEOC * /YMZBT
         + /MRESET * YNOPIPE * /YPIPE * /MCACHE * /MEMZBTEN
         + /MRESET * YNOPIPE * /YPIPE * /MCACHE * /YMZBT
         + /MRESET * MHLDA * /MAOE * /MEMZBTEN * YMZBT
         + /MRESET * /YNOPIPE * /MCACHE * /MEMZBTEN * YMZBT
         + MRESET
YMZBT.CLKF = MCLK
YMZBT.RSTF = GND
YMZBT.SETF = GND
YMZBT.TRST = VCC

/FPFLD = FPFLDEN * MRESET
FPFLD.TRST = MRESET

/YFLUSH = MRESET * /NCPFLD
        + /MRESET * /FLUSH
YFLUSH.TRST = VCC

/YSYNC = MRESET * /MALDRV
       + /MRESET * /SYNC
YSYNC.TRST = VCC
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TITLE ESIGGEN 

PATTERN 

REVISION 1.0 

AUTHOR 

COMPANY INTEL 

DATE 

CHIP INTEL 85C224 



Declaration Segment

This PLD drives memory bus and core signals based on the states
of other state machines

Pin Declarations

PIN 1 YDRCTM
PIN 2 YMADS
PIN 3 YMAOE
PIN 4 MHLDA
PIN 5 NC1
PIN 6 MWBWT
PIN 7 MDRCTM
PIN 8 SNPDIS
PIN 9 UNI
PIN 10 YMSEL
PIN 11 TR4
PIN 13 YMFRZ
PIN 14 MDLDRV
PIN 23 LMRST
PIN 15 C8MSEL
PIN 16 NC2
PIN 17 CDRCTM
PIN 18 CWBWT
PIN 19 MBOFF
PIN 20 MADS
PIN 21 YSNPDIS
PIN 22 C8MFRZ

EQUATIONS

/C8MSEL = LMRST * /TR4
        + /LMRST * /YMSEL
C8MSEL.TRST = VCC

NC2 = VCC
NC2.TRST = VCC

CDRCTM = MDRCTM * YDRCTM
CDRCTM.TRST = VCC

/CWBWT = /MWBWT * YDRCTM
CWBWT.TRST = VCC

/MBOFF = YMAOE * /MHLDA
MBOFF.TRST = /MHLDA

/MADS = /YMADS * /YMAOE
MADS.TRST = /YMAOE

YSNPDIS = SNPDIS * UNI
YSNPDIS.TRST = VCC

/C8MFRZ = LMRST * /MDLDRV
        + /LMRST * /YMFRZ
C8MFRZ.TRST = VCC
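
The ESIGGEN equations are purely combinatorial, so they can be checked in software before a device is programmed. The following is a minimal sketch, assuming the usual PALASM-style reading of the listing (`/` is NOT, `*` is AND, `+` is OR). The function name `esiggen`, the dictionary interface and the `_EN` suffix for output enables are hypothetical; the `.TRST` terms are reported as flags and are not modelled electrically.

```python
def esiggen(pins):
    """Evaluate the ESIGGEN outputs for one set of input levels.

    `pins` maps input pin names to booleans (True = logic high).
    Returns output levels, plus `<name>_EN` flags for the outputs
    whose .TRST term is not tied to VCC.
    """
    p = pins
    out = {}
    # /C8MSEL = LMRST * /TR4 + /LMRST * /YMSEL  (a 2:1 select on LMRST)
    out["C8MSEL"] = not ((p["LMRST"] and not p["TR4"]) or
                         (not p["LMRST"] and not p["YMSEL"]))
    # CDRCTM = MDRCTM * YDRCTM
    out["CDRCTM"] = p["MDRCTM"] and p["YDRCTM"]
    # /CWBWT = /MWBWT * YDRCTM
    out["CWBWT"] = not (not p["MWBWT"] and p["YDRCTM"])
    # /MBOFF = YMAOE * /MHLDA, driven only while MHLDA is low
    out["MBOFF"] = not (p["YMAOE"] and not p["MHLDA"])
    out["MBOFF_EN"] = not p["MHLDA"]
    # /MADS = /YMADS * /YMAOE, driven only while YMAOE is low
    out["MADS"] = not (not p["YMADS"] and not p["YMAOE"])
    out["MADS_EN"] = not p["YMAOE"]
    # YSNPDIS = SNPDIS * UNI
    out["YSNPDIS"] = p["SNPDIS"] and p["UNI"]
    # /C8MFRZ = LMRST * /MDLDRV + /LMRST * /YMFRZ
    out["C8MFRZ"] = not ((p["LMRST"] and not p["MDLDRV"]) or
                         (not p["LMRST"] and not p["YMFRZ"]))
    return out


# All inputs low, used as a starting point for individual test vectors.
ALL_LOW = {name: False for name in (
    "LMRST", "TR4", "YMSEL", "MDRCTM", "YDRCTM", "MWBWT", "YMAOE",
    "MHLDA", "YMADS", "SNPDIS", "UNI", "MDLDRV", "YMFRZ")}
```

Read this way, C8MSEL follows TR4 while LMRST is high and follows YMSEL otherwise; C8MFRZ selects between MDLDRV and YMFRZ in the same manner.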
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— Declaration Segment- — - — 


TITLE ESWND 




PATTERN A 




REVISION 2.0 




AUTHOR ISIC SILAS 




COMPANY INTEL 




DATE 2/4/91 




CHIP xOl 85C224 






This PLD contains the XCRDY, XSWND, and XENSWND state machines. 






Pin Declarations

PIN 1 CLK 




PIN 2 RESET 




PIN 3 WSDTS 




PIN 4 BGT 




PIN 5 PBGT 




PIN 6 TR4 




PIN 7 YSMSWND 




PIN 8 SNPDIS 




PIN 9 YSMEOC 




PIN 10 SLFTST 




PIN 13 OEx 




PIN 16 ENSWND 




PIN 17 SV3 




PIN 18 SWEND 




PIN 19 SV2 




PIN 20 SV1 




PIN 21 CRDY 




PIN 22 CRDY1 




EQUATIONS 




ENSWND.D := /RESET * /YSMSWND
ENSWND.CLKF = CLK
ENSWND.RSTF = GND
ENSWND.SETF = GND
/ENSWND.TRST = OEx

/SV3.D := RESET * TR4
       + /RESET * CRDY * SWEND * /SV3
       + /RESET * /PBGT * /ENSWND * /YSMSWND * CRDY * SV3
       + /RESET * /PBGT * CRDY * /SNPDIS * /SWEND * SV3
       + /RESET * /PBGT * /ENSWND * /YSMSWND * SWEND * SV3
SV3.CLKF = CLK
SV3.RSTF = GND
SV3.SETF = GND
/SV3.TRST = OEx

/SWEND.D := RESET * TR4
         + /RESET * /CRDY * SWEND * /SV3
         + /RESET * PBGT * CRDY * /SWEND * SV3
         + /RESET * /BGT * /SNPDIS * SWEND * SV3
         + /RESET * ENSWND * CRDY * SNPDIS * /SWEND * SV3
         + /RESET * YSMSWND * CRDY * SNPDIS * /SWEND * SV3
         + /RESET * /BGT * /ENSWND * /YSMSWND * SWEND * SV3
         + /RESET * /PBGT * /ENSWND * /YSMSWND * /CRDY * /SWEND * SV3
SWEND.CLKF = CLK
SWEND.RSTF = GND
SWEND.SETF = GND
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/SWEND.TRST = OEx 




/SV2.D := RESET * /SLFTST
       + /RESET * /YSMEOC * CRDY * /SV2
       + /RESET * /YSMEOC * /CRDY * SV2
SV2.CLKF = CLK
SV2.RSTF = GND
SV2.SETF = GND
/SV2.TRST = OEx

/SV1.D := /RESET * /YSMEOC * CRDY
       + /RESET * CRDY * /SV1 * SV2
SV1.CLKF = CLK
SV1.RSTF = GND
SV1.SETF = GND
/SV1.TRST = OEx

/CRDY.D := RESET * /SLFTST
        + /RESET * /YSMEOC * /WSDTS * CRDY * SV2
        + /RESET * /WSDTS * CRDY * /SV1 * SV2
CRDY.CLKF = CLK
CRDY.RSTF = GND
CRDY.SETF = GND
/CRDY.TRST = OEx

/CRDY1.D := RESET * /SLFTST
         + /RESET * /YSMEOC * /WSDTS * CRDY1 * SV2
         + /RESET * /WSDTS * CRDY1 * /SV1 * SV2
CRDY1.CLKF = CLK
CRDY1.RSTF = GND
CRDY1.SETF = GND
/CRDY1.TRST = OEx
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- — -Declaration Segment- - — — 


TITLE EWCPLB
PATTERN A
REVISION 2.
AUTHOR ISIC SILAS
COMPANY INTEL
DATE 2/4/91
CHIP xOl 85C224

This PLD contains the XWCPLB and YCPULEN state machines.

Pin Declarations

PIN 1 CLK
PIN 2 RESET
PIN 3 CRDY
PIN 4 RDYSRC
PIN 5 BGT
PIN 6 PBGT
PIN 7 KCACHE
PIN 8 LEN
PIN 9 CACHE
PIN 10 CKEN
PIN 11 BRDY
PIN 13 OEx
PIN 16 CLEN4
PIN 17 CLEN2
PIN 18 CLEN1
PIN 19 LKCACHE
PIN 20 SV
PIN 21 CPUEN
PIN 22 WCPLB




EQUATIONS 
/CLEN4.D := CPUEN * /CLEN2 * /CLEN4
         + /BRDY * /CLEN1
         + BRDY * CLEN2 * /CLEN4
         + /CACHE * /CKEN * /LKCACHE * /CLEN2 * /CLEN4
         + RESET
CLEN4.CLKF = CLK
CLEN4.RSTF = GND
CLEN4.SETF = GND
/CLEN4.TRST = OEx

/CLEN2.D := CPUEN * /CLEN2 * /CLEN4
         + BRDY * /CLEN2 * CLEN4
         + /BRDY * CLEN2 * CLEN4
         + LEN * CACHE * /CLEN2 * /CLEN4
         + LEN * CKEN * /CLEN2 * /CLEN4
         + LEN * LKCACHE * /CLEN2 * /CLEN4
         + RESET
CLEN2.CLKF = CLK
CLEN2.RSTF = GND
CLEN2.SETF = GND
/CLEN2.TRST = OEx

/CLEN1.D := /RESET * BRDY * /CLEN1
         + /RESET * /BRDY * /CLEN2 * CLEN4
         + /RESET * /CPUEN * /LEN * CACHE * /CLEN2 * /CLEN4
         + /RESET * /CPUEN * /LEN * CKEN * /CLEN2 * /CLEN4
         + /RESET * /CPUEN * /LEN * LKCACHE * /CLEN2 * /CLEN4
CLEN1.CLKF = CLK
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CLEN1.RSTF = GND 
CLEN1.SETF = GND 
/CLEN1.TRST = OEx 

/LKCACHE.D := /KCACHE 
LKCACHE.CLKF = CLK 
LKCACHE.RSTF = GND 
LKCACHE.SETF = GND 
/LKCACHE.TRST = OEx 



/SV.D := /RESET * CRDY * /SV
      + /RESET * /RDYSRC * /BGT * CPUEN * SV
      + /RESET * CRDY * /BRDY * /CLEN1 * WCPLB * /CPUEN
SV.CLKF = CLK
SV.RSTF = GND
SV.SETF = GND
/SV.TRST = OEx

/CPUEN.D := /RESET * BRDY * /CPUEN
         + /RESET * CLEN1 * /CPUEN
         + /RESET * RDYSRC * /BGT * /WCPLB
         + /RESET * RDYSRC * /BGT * CPUEN * SV
CPUEN.CLKF = CLK
CPUEN.RSTF = GND
CPUEN.SETF = GND
/CPUEN.TRST = OEx

/WCPLB.D := /RESET * BRDY * /WCPLB
         + /RESET * CLEN1 * /WCPLB
         + /RESET * /CRDY * BRDY * /CPUEN
         + /RESET * /CRDY * CLEN1 * /CPUEN
WCPLB.CLKF = CLK
WCPLB.RSTF = GND
WCPLB.SETF = GND
/WCPLB.TRST = OEx
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MCLK 
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PIN 
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PIN 
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YNOPIPE 


PIN 
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YMEOC 
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MHITMI 


PIN 
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YMSWEND 


PIN 10 YNOSWND
PIN 11 YBGT
PIN 13 OEx
PIN 14 YALLOC
PIN 23 PCTCXFR
PIN 15 UNUSED
PIN 16 PSWBAS
PIN 17 SV
PIN 18 WMSWND
PIN 19 ENMSWND
PIN 20 ENXSAS
PIN 21 PXSAS
PIN 22 YSWEHITM


EQUATIONS 




UNUSED = VCC
UNUSED.TRST = VCC

/PSWBAS = /XSAS * /XSNPWB * /ENXSAS
PSWBAS.TRST = VCC

/SV.D := YMEOC * /WMSWND * /SV
      + /YNOSWND * /YPIPE * YMEOC * /WMSWND
      + /YPIPE * /YMSWEND * /ENMSWND * YMEOC * /WMSWND
SV.CLKF = MCLK
SV.RSTF = GND
SV.SETF = GND
/SV.TRST = OEx

/WMSWND.D := /MRESET * YMEOC * /WMSWND
          + /MRESET * /WMSWND * /SV
          + /MRESET * /YNOSWND * /YPIPE * /WMSWND
          + /MRESET * /YNOSWND * /YBGT * WMSWND
          + /MRESET * /YPIPE * /YMSWEND * /ENMSWND * /WMSWND
          + /MRESET * YPIPE * /PCTCXFR * /YALLOC * /WMSWND
          + /MRESET * /YMSWEND * /ENMSWND * /YALLOC * WMSWND
          + /MRESET * YNOSWND * /YNOPIPE * /YMSWEND * /ENMSWND * WMSWND
WMSWND.CLKF = MCLK
WMSWND.RSTF = GND
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WMSWND.SETF = GND 
/WMSWND.TRST = OEx 

/ENMSWND.D := YMSWEND 
ENMSWND.CLKF = MCLK 
ENMSWND.RSTF = GND 
ENMSWND.SETF = GND 
/ENMSWND.TRST = OEx 

/ENXSAS.D := YBGT * /ENXSAS 

+ XSAS * ENXSAS 

+ MRESET 
ENXSAS.CLKF = MCLK
ENXSAS.RSTF = GND
ENXSAS.SETF = GND
/ENXSAS.TRST = OEx

/PXSAS = /XSAS * XSNPWB * /ENXSAS 
PXSAS.TRST = VCC 

/YSWEHITM = /YMSWEND * /ENMSWND * /MHITMI * YALLOC 

+ /YMSWEND * /ENMSWND * /MHITMI * YNOPIPE * YPIPE 
YSWEHITM. TRST = VCC 
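
The registered (`.D :=`) equations in these listings describe the value a flip-flop takes on the next MCLK edge. As an illustration, the sketch below steps the ENXSAS register from this PLD through a few clocks. It assumes the PALASM-style convention that an equation written for `/ENXSAS.D` gives the conditions under which the register is loaded low; the function name `enxsas_next` and the stimulus list are hypothetical.

```python
def enxsas_next(enxsas, ybgt, xsas, mreset):
    """Return the ENXSAS level after the next MCLK edge.

    /ENXSAS.D := YBGT * /ENXSAS + XSAS * ENXSAS + MRESET
    """
    load_low = (ybgt and not enxsas) or (xsas and enxsas) or mreset
    return not load_low


def trace(stimulus, enxsas=True):
    """Apply a list of (ybgt, xsas, mreset) tuples, one per clock."""
    levels = []
    for ybgt, xsas, mreset in stimulus:
        enxsas = enxsas_next(enxsas, ybgt, xsas, mreset)
        levels.append(enxsas)
    return levels


# Reset, hold low while YBGT is high, release, then clear on XSAS.
stimulus = [
    (False, False, True),   # MRESET forces the register low
    (True,  False, False),  # YBGT high keeps it low
    (False, False, False),  # no term active: register goes high
    (False, False, False),  # stays high
    (False, True,  False),  # XSAS high while ENXSAS high: goes low
]
print(trace(stimulus))  # [False, False, True, True, False]
```

The same pattern (compute the sum of products, invert for an active-low left-hand side, update once per clock) applies to the other registered outputs.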
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80960SA/80960SB 

EMBEDDED 32-BIT PROCESSORS

WITH 16-BIT BURST DATA BUS 



• High-Performance Embedded Architecture
  - 16 MIPS Burst Execution at 16 MHz
  - 5 MIPS* Sustained Execution at 16 MHz

• 512-Byte On-Chip Instruction Cache
  - Direct Mapped
  - Parallel Load/Decode for Uncached Instructions

• Multiple Register Sets
  - Sixteen Global 32-Bit Registers
  - Sixteen Local 32-Bit Registers
  - Four Local Register Sets Stored On-Chip
  - Register Scoreboarding

• Software Compatible with 80960KA/KB/CA Processors

• Built-in Interrupt Controller
  - 4 Direct Interrupt Pins
  - 32 Priority Levels, 256 Vectors

• Built-in Floating Point Unit (80960SB only)
  - Fully IEEE 754 Compatible

• Easy to Use, High Bandwidth 16-Bit Bus
  - 25.6 Mbyte/sec Burst
  - Up to 16 Bytes Transferred per Burst

• 32-Bit Address Space, 4 Gigabytes

• 80-Lead Quad Flat Pack (EIAJ QFP)

• 84-Lead Plastic Leaded Chip Carrier (PLCC)




The 80960SA and 80960SB are members of Intel's i960 32-bit processor family, which is designed especially
for low-cost embedded applications. They are based on the family's high-performance common core architecture
and include a 512-byte instruction cache and a built-in interrupt controller. The 80960SA and 80960SB
have a large register set, multiple parallel execution units and a high-bandwidth, 16-bit burst bus. Using
advanced RISC technology, these high-performance processors are capable of execution rates in excess of
5 million instructions per second.* The 80960SA and 80960SB are well suited for a wide range of
cost-sensitive embedded applications such as laser printers, EISA and MCA adapters, disk controllers and X
Terminals.

*Relative to Digital Equipment Corporation's VAX-11/780** at 1 MIPS
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**VAX-11tm is a trademark of Digital Equipment Corporation. 
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THE I960TM PROCESSOR SERIES 

The 80960SA and 80960SB are members of a new 
family of 32-bit microprocessors from Intel known as 
the i960 Series. This series was especially designed 
to serve the needs of embedded applications. The 
embedded market includes applications as diverse 
as industrial automation, avionics, image processing, 
graphics, robotics, telecommunications and automo- 
biles. These types of applications require high inte- 
gration, low power consumption, quick interrupt re- 
sponse times and high performance. Since time to 
market is critical, embedded microprocessors need 
to be easy to use in both hardware and software 
designs. 



All members of the 80960 series share a common 
core architecture which utilizes RISC technology so 
that, except for special functions, the family mem- 
bers are object code compatible. Each new proces- 
sor in the series will add its own special set of func- 
tions to the core to satisfy the needs of a specific 
application or range of applications for the embed- 
ded market. For example, future processors may in- 
clude a DMA controller, a timer or an A/D converter. 

Software written for the 80960SA and 80960SB will 
run without modification on any other member of the 
80960 family. The 80960SA is pin compatible with 
the 80960SB, which includes an integrated floating- 
point unit. 
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NOTES: 

1. Register g15 is reserved for stack management functions.

2. Floating-Point registers and operations are available only in the 960SB and 960KB processors. 

3. Registers r0, r1 and r2 are reserved for stack management functions. 

4. Register g14 is used by BAL and BALX instructions. 



Figure 2. 80960 Register Set 
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Key Performance Features 

The 80960SA and 80960SB architecture is based
on the most recent advances in RISC technology
and is grounded in Intel's long experience in designing
embedded controllers. Many features contribute
to the exceptional performance of the 80960SA and
80960SB:


1. Large Register Set. Modern compilers can take 
advantage of a large number of registers to optimize 
execution speed. For maximum flexibility, the 
80960SA and 80960SB provide 32 32-bit registers 
and four 80-bit floating-point registers. (See 
Figure 2.) 

2. Fast Instruction Execution. Simple functions 
make up the bulk of instructions in most programs, 
so that execution speed can be greatly improved by 
ensuring that these core instructions execute in as 
short a time as possible. The most-frequently exe- 
cuted instructions such as register-register moves, 
add/subtract, logical operations, and shifts execute 
in one to two cycles (Table 1 contains a list of in- 
structions). 

3. Load/Store Architecture. Like other processors
based on RISC technology, the 80960SA and
80960SB have a Load/Store architecture. Only the
LOAD and STORE instructions reference memory;
all other instructions operate on registers. This type
of architecture simplifies instruction decoding and is
used in combination with other techniques to
increase parallelism.

4. Simple Instruction Formats. All instructions in 
the 80960SA and 80960SB are 32 bits long and 
must be aligned on word boundaries. This alignment 
makes it possible to eliminate the instruction-align- 
ment stage in the pipeline. To simplify the instruction 
decoder further, there are only five instruction for- 
mats and each instruction type uses only one for- 
mat. (See Figure 3.) 

5. Overlapped Instruction Execution. A load oper- 
ation allows execution of subsequent instructions to 
continue before the data has been returned from 
memory, so that these instructions can overlap the 
load. The 80960SA and 80960SB manage this pro- 
cess transparently to software through the use of a 
register scoreboard. Conditional instructions also 
make use of a scoreboard so that subsequent unre- 
lated instructions can be executed while the condi- 
tional instruction is pending. 
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Figure 3. Instruction Formats 
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Table 1. 80960SA and 80960SB Instruction Set 
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And Not 


Not Bit 
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Divide 
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Check Bit 




Remainder 
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Alter Bit 
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Scan for Bit 




Shift 
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Scan over Bit 




Extended Multiply 
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Comparison 
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(80960SB only) 


Floating-Point 
(80960SB only) 
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6. Integer Execution Optimization. When the re- 
sult of an operation is used as an operand in a sub- 
sequent calculation, the value is sent immediately to 
its destination register. Yet at the same time, the 
value is put back on a bypass path to the ALU, 
thereby saving the time that otherwise would be re- 
quired to retrieve the value for the next operation. 

7. Bandwidth Optimizations. The 80960SA and 
80960SB get optimal use of their memory bus band- 
width because the bus is tuned for use with the 
cache; the line size of the instruction cache matches 
the maximum burst size for instruction fetches. The 
80960SA and 80960SB automatically fetch four 
words in a burst and store them directly in the 
cache. Due to the size of the cache and the fact that 
it is continually filled in anticipation of needed in- 
structions in the program flow, the 80960SA and 
80960SB are exceptionally insensitive to memory 
wait states. In fact, each wait state causes only a 
10% degradation in system performance. The bene- 
fit is that the 80960SA and 80960SB will deliver out- 
standing performance even with a low cost memory 
system. 

8. Cache Bypass. If there is a cache miss, the proc- 
essor fetches the needed instruction, then sends it 
on to the instruction decoder at the same time it 
updates the cache. Thus, no extra time is taken to 
load and read the cache. 



Memory Space and Addressing Modes 

The 80960SA and 80960SB offer a linear program- 
ming environment so that all programs running on 
the processors are contained in a single address 
space. The maximum size of the address space is 
4 Gigabytes. 

For ease of use, the 80960SA and 80960SB have a 
small number of addressing modes, but include all 
those necessary to ensure efficient compiler imple- 
mentations of high-level languages such as C, For- 
tran and Ada. Table 2 lists the memory addressing 
modes. 



Data Types 

The 80960SA and 80960SB recognize the following 
data types: 

Numeric: 

• 8-, 16-, 32- and 64-bit ordinals

• 8-, 16-, 32- and 64-bit integers

• 32-, 64- and 80-bit reals

Non-Numeric:

• Bit

• Bit Field

• Triple-Word (96 bits) 

• Quad-Word (128 bits) 



Large Register Set 

The programming environment of the 80960SA and
80960SB includes a large number of registers. In fact,
32 registers are available at any time. The availability
of this many registers greatly reduces the number of
memory accesses required to execute most programs,
which leads to greater instruction processing
speed.

There are two types of general-purpose registers:
local and global. The global registers consist of sixteen
32-bit registers (G0 through G15). These registers
perform the same function as the general-purpose
registers provided in other popular microprocessors.
The term global refers to the fact that these
registers retain their contents across procedure
calls.

The local registers, on the other hand, are proce- 
dure specific. For each procedure call, the 80960SA 
and 80960SB allocate 16 local registers (R0 through 
R15). Each local register is 32 bits wide. 



Multiple Register Sets 

To further increase the efficiency of the register set, 
multiple sets of local registers are stored on-chip. 
This cache holds up to four local register frames, 
which means that up to three procedure calls can be 
made without having to access the procedure stack 
resident in memory. 
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Table 2. Memory Addressing Modes 



• 12-Bit Offset

• 32-Bit Offset

• Register-Indirect

• Register + 12-Bit Offset

• Register + 32-Bit Offset

• Register + (Index-Register x Scale-Factor)

• Register x Scale-Factor + 32-Bit Displacement

• Register + (Index-Register x Scale-Factor) + 32-Bit Displacement

Scale-Factor is 1, 2, 4, 8 or 16
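
All of the modes in Table 2 can be viewed as special cases of one sum: base register + (index register x scale factor) + offset or displacement, with unused terms set to zero. The sketch below illustrates that view only; the function name `effective_address` and the wrap-around within the 32-bit address space are assumptions, not taken from the instruction set definition.

```python
MASK32 = 0xFFFFFFFF               # 4-Gigabyte linear address space
VALID_SCALES = (1, 2, 4, 8, 16)   # scale factors listed under Table 2


def effective_address(base=0, index=0, scale=1, displacement=0):
    """Combine the Table 2 components into one 32-bit address.

    Unused components default to zero, so the same function covers
    the absolute, register-indirect and indexed modes. Wrapping the
    result to 32 bits is an assumption made for this sketch.
    """
    if scale not in VALID_SCALES:
        raise ValueError("scale factor must be 1, 2, 4, 8 or 16")
    return (base + index * scale + displacement) & MASK32


# Register + 12-Bit Offset
print(hex(effective_address(base=0x1000, displacement=0x7FF)))   # 0x17ff
# Register + (Index-Register x Scale-Factor)
print(hex(effective_address(base=0x1000, index=3, scale=4)))     # 0x100c
# Register x Scale-Factor + 32-Bit Displacement
print(hex(effective_address(index=2, scale=16, displacement=0x20)))  # 0x40
```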



Although programs may have procedure calls nest- 
ed many calls deep, a program typically oscillates 
back and forth between only two or three levels. As 
a result, with four stack frames in the cache, the 
probability of there being a free frame on the cache 
when a call is made is very high. In fact, runs of 
representative C-language programs show that 80% 
of the calls are handled without needing to access 
memory. 

If there are four or more active procedures and a
new procedure is called, the processor moves the
oldest set of local registers in the register cache to a
procedure stack in memory to make room for a new
set of registers. Global register G15 is used by the
processor as the frame pointer (FP) for the procedure
stack.

Note that the global registers are not exchanged on 
a procedure call, but retain their contents, making 
them available to all procedures for fast parameter 
passing. An illustration of the register cache is 
shown in Figure 4. 
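
The behavior of the register cache can be illustrated with a small software model. This is a sketch only: the class name `RegisterCache` and its counters are hypothetical, and the reload of a spilled frame on return is an assumption, since the text describes only the spill on a call.

```python
from collections import deque


class RegisterCache:
    """Toy model of the four-frame on-chip local register cache."""

    FRAMES = 4

    def __init__(self):
        self.on_chip = deque([0])  # frame 0 belongs to the initial procedure
        self.in_memory = []        # frames moved to the procedure stack
        self.depth = 0             # current call depth
        self.spills = 0            # frames written to memory on calls
        self.fills = 0             # frames read back on returns (assumed)

    def call(self):
        if len(self.on_chip) == self.FRAMES:
            # Cache full: move the oldest frame to the procedure stack.
            self.in_memory.append(self.on_chip.popleft())
            self.spills += 1
        self.depth += 1
        self.on_chip.append(self.depth)

    def ret(self):
        if self.depth == 0:
            raise RuntimeError("return without a matching call")
        self.on_chip.pop()
        self.depth -= 1
        if not self.on_chip and self.in_memory:
            # The caller's frame was spilled earlier; bring it back.
            self.on_chip.append(self.in_memory.pop())
            self.fills += 1


cache = RegisterCache()
for _ in range(3):
    cache.call()
print(cache.spills)   # 0: three nested calls fit on chip
cache.call()
print(cache.spills)   # 1: the fourth call spills the oldest frame
```

In this model a program that moves back and forth between the deepest two or three levels causes no further memory traffic, which is the effect described above.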
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Figure 4. Multiple Register Sets are Stored On-Chip 
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Instruction Cache 

To further reduce memory accesses, the 80960SA 
and 80960SB include a 512-byte on-chip instruction 
cache. The instruction cache is based on the con- 
cept of locality of reference; that is, most programs 
are not usually executed in a steady stream but con- 
sist of many branches and loops that lead to jumping 
back and forth within the same small section of 
code. Thus, by maintaining a block of instructions in 
a cache, the number of memory references required 
to read instructions into the processor can be greatly 
reduced. 

To load the instruction cache, instructions are
fetched in 16-byte blocks, so that up to four instructions
can be fetched at one time.

Code for small loops will often fit entirely within the 
cache, leading to a great increase in processing 
speed since further memory references might not be 
necessary until the program exits the loop. Similarly, 
when calling short procedures, the code for the call- 
ing procedure is likely to remain in the cache, so it 
will be there on the procedure's return. 
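
The effect of a small direct-mapped cache on a loop can be sketched as follows. The model assumes 16-byte lines (matching the 16-byte fetch block), which gives 32 lines in 512 bytes; the split of an address into tag, index and offset is the conventional direct-mapped arrangement and is an assumption about this device, as are the names `split_address` and `ICache`.

```python
CACHE_BYTES = 512
LINE_BYTES = 16                         # assumed line size
NUM_LINES = CACHE_BYTES // LINE_BYTES   # 32 lines


def split_address(addr):
    """Split a byte address into (tag, line index, byte offset)."""
    offset = addr % LINE_BYTES
    index = (addr // LINE_BYTES) % NUM_LINES
    tag = addr // CACHE_BYTES
    return tag, index, offset


class ICache:
    """Direct-mapped cache that tracks tags only (no instruction data)."""

    def __init__(self):
        self.tags = [None] * NUM_LINES
        self.hits = 0
        self.misses = 0

    def fetch(self, addr):
        tag, index, _ = split_address(addr)
        if self.tags[index] == tag:
            self.hits += 1
            return True
        self.tags[index] = tag          # line fill replaces the old entry
        self.misses += 1
        return False


# An 8-instruction (32-byte) loop at 0x1000, executed 10 times.
cache = ICache()
for _ in range(10):
    for addr in range(0x1000, 0x1020, 4):
        cache.fetch(addr)
print(cache.misses, cache.hits)  # 2 78
```

Only the first pass through the loop touches memory (two line fills); the remaining passes run entirely from the cache.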



Register Scoreboarding 

The instruction decoder has been optimized in sev- 
eral ways. One of these optimizations is the ability to 
do instruction overlapping by means of register 
scoreboarding. 

Register scoreboarding occurs when a LOAD instruction
is executed to move a variable from memory
into a register. When the instruction is initiated, a
scoreboard bit on the target register is set. When the
register is actually loaded, the bit is reset. In between,
any reference to the register contents is accompanied
by a test of the scoreboard bit to ensure
that the load has completed before processing continues.
Since the processor does not have to wait for
the LOAD to be completed, it can go on to execute
additional instructions placed in between the LOAD
instruction and the instruction that uses the register
contents, as shown in the following example:

LOAD address 1, R4
LOAD address 2, R5
Unrelated instruction
Unrelated instruction
ADD R4, R5, R6

In essence, the two unrelated instructions between
the LOAD and ADD instructions are executed for
free (i.e., take no apparent time to execute) because
they are executed while the register is being loaded.
Up to three instructions can be pending at one time
with three corresponding scoreboard bits set. By exploiting
this feature, system programmers and compilers
have a useful tool for optimizing execution
speed.
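
The "free" instructions in the example can be demonstrated with a simple timing model. This is a sketch under stated assumptions: every instruction issues in one cycle, a load returns its data a fixed number of cycles after issue (the `load_latency` value of 3 is hypothetical), loads may overlap each other, and the limit of three pending operations is not modelled. The function name `issue_cycles` is also hypothetical.

```python
def issue_cycles(program, load_latency=3):
    """Return the number of cycles needed to issue every instruction.

    `program` is a list of (op, dest, sources) tuples. A LOAD marks its
    destination register as pending (scoreboard bit set) until the data
    returns; any later instruction that touches a pending register
    stalls until the bit is cleared.
    """
    ready_at = {}   # register -> cycle at which its scoreboard bit clears
    cycle = 0
    for op, dest, sources in program:
        for reg in (*sources, dest):
            cycle = max(cycle, ready_at.get(reg, 0))  # stall if pending
        if op == "LOAD":
            ready_at[dest] = cycle + load_latency
        cycle += 1
    return cycle


overlapped = [
    ("LOAD", "r4", ()),
    ("LOAD", "r5", ()),
    ("OP", "r10", ("r8", "r9")),   # unrelated instruction
    ("OP", "r11", ("r8", "r9")),   # unrelated instruction
    ("ADD", "r6", ("r4", "r5")),
]
back_to_back = [overlapped[0], overlapped[1], overlapped[4]]

print(issue_cycles(overlapped), issue_cycles(back_to_back))  # 5 5
```

Under these assumptions both sequences finish in the same number of cycles: without the unrelated instructions the ADD simply waits for the scoreboard bits to clear.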



Floating-Point Arithmetic 

In the 80960SB, floating-point arithmetic has been 
made an integral part of the architecture. Having the 
floating-point unit integrated on-chip provides two 
advantages. First, it improves the performance of 
the chip for floating-point applications, since no ad- 
ditional bus overhead is associated with floating- 
point calculations, thereby leaving more time for oth- 
er bus operations such as I/O. Second, the cost of 
using floating-point operations is reduced because a 
separate coprocessor chip is not required. 

The 80960SB floating-point (real number) data types
include single-precision (32-bit), double-precision
(64-bit) and extended-precision (80-bit) floating-point
numbers. Any register may be used to execute
floating-point operations.

The processor provides hardware support for both 
mandatory and recommended portions of IEEE 
Standard 754 for floating-point arithmetic, exponen- 
tial, logarithmic and other transcendental functions. 
Table 3 shows execution times for some representa- 
tive instructions. 
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Table 3. Sample Floating-Point 
Execution Times (µs) at 16 MHz

Function       32-Bit   64-Bit
Add              0.6      0.8
Subtract         0.6      0.8
Multiply         1.1      2.0
Divide           2.0      4.5
Square Root      5.8      6.1
Arctangent      15.8     20.5
Exponent        17.7     19.5
Sine            23.8     25.9
Cosine          23.8     25.9



High Bandwidth Bus 

The 80960SA and 80960SB CPUs reside on a high- 
bandwidth address/data bus. The bus provides a di- 
rect communication path between the processor 
and the memory and I/O subsystem interfaces. The 
processor uses the bus to fetch instructions, manip- 
ulate memory and respond to interrupts. Its features 
include: 



• 16-bit data path multiplexed onto the lower bits of
the 32-bit address path

• Eight 16-bit half-word burst capacity, which allows
transfers from 1 to 16 bytes at a time

• High bandwidth reads and writes at 25.6 Mbytes
per second

Figure 5 identifies the groups of signals which con- 
stitute the Bus. Table 4 lists the function of the Bus 
and other processor-support signals, such as the in- 
terrupt lines. 
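
The quoted burst rate can be related to the clock by simple arithmetic. The sketch below assumes that the 25.6 Mbyte/sec figure refers to a maximum-length 16-byte burst at 16 MHz; under that assumption a burst occupies ten clock cycles. The split of those ten cycles into data transfers and overhead is not stated here and is not modelled. The function names are hypothetical.

```python
def burst_bandwidth(bytes_per_burst, clocks_per_burst, clock_hz):
    """Bytes per second moved when bursts run back to back."""
    return bytes_per_burst * clock_hz / clocks_per_burst


def clocks_for_bandwidth(bytes_per_burst, bandwidth, clock_hz):
    """Clock cycles per burst implied by a quoted bandwidth."""
    return bytes_per_burst * clock_hz / bandwidth


CLOCK_HZ = 16_000_000   # 16 MHz
BURST_BYTES = 16        # eight 16-bit half-words

print(clocks_for_bandwidth(BURST_BYTES, 25.6e6, CLOCK_HZ))   # 10.0
print(burst_bandwidth(BURST_BYTES, 10, CLOCK_HZ) / 1e6)      # 25.6
```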



Interrupt Handling 

The 80960SA and 80960SB can be interrupted in 
one of two ways: by the activation of one of four 
interrupt pins or by sending a message on the proc- 
essor's data bus. 

The 80960SA and 80960SB are unusual in that they 
automatically handle interrupts on a priority basis 
and track pending interrupts through their on-chip 
interrupt controller. Two of the interrupt pins can be 
configured to provide 8259A handshaking for expan- 
sion beyond four interrupt lines. 
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Figure 5. 80960SA and 80960SB Bus Signal Groups 
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Debug Features 

The 80960SA and 80960SB have built-in debug ca- 
pabilities. There are two types of breakpoints and six 
different trace modes. The debug features are con- 
trolled by two internal 32-bit registers, the Process- 
Controls Word and the Trace-Controls Word. By set- 
ting bits in these control words, a software debug 
monitor can closely control how the processor re- 
sponds during program execution. 

The 80960SA and 80960SB have both hardware 
and software breakpoints. They provide two hard- 
ware breakpoint registers on-chip which can be set 
by a special command to any value. When the in- 
struction pointer matches the value in one of the 
breakpoint registers, the breakpoint will fire, and a 
breakpoint handling routine is called automatically. 

Tracing is available for all instructions (single-step 
execution), calls and returns and branching. Each 
different type of trace may be enabled separately by 
a special debug instruction. In each case, the 
80960SA and 80960SB execute the instruction first 
and then call a trace handling routine (usually part of 
a software debug monitor). Further program execu- 
tion is halted until the trace routine is completed. 
When the trace event handling routine is completed, 
instruction execution resumes at the next instruc- 
tion. The 80960SA and 80960SB's tracing mecha- 
nisms, which are implemented completely in hard- 
ware, greatly simplify the task of testing and debug- 
ging software. 



FAULT DETECTION

The 80960SA and 80960SB have an automatic
mechanism to handle faults. There are ten fault
types including trace, arithmetic, and floating-point
faults. When the processor detects a fault, it automatically
calls the appropriate fault handling routine
and saves the current instruction pointer and necessary
state information to make efficient recovery
possible. The processor posts diagnostic information
on the type of fault to a Fault Record. Like interrupt
handling routines, fault handling routines are
usually written to meet the needs of a specific
application and are often included as part of the
operating system or kernel.

For each of the ten fault types, there are numerous
subtypes that provide specific information about a
fault. For example, a floating-point fault may have its
subtype set to an Overflow or Zero-Divide fault. The
fault handler can use this specific information to
respond correctly to the fault.



BUILT-IN TESTABILITY

Upon reset, the 80960SA and 80960SB automatically
conduct an extensive internal test (self-test) of their
major blocks of logic. Then, before executing the first
instruction, the processor does a zero check sum on the
first eight words in memory to ensure that the system
has been loaded correctly. If a problem is discovered
at any point during the self-test, the 80960SA
and 80960SB will indicate a failure and will not begin
program execution. The self-test takes approximately
47,000 cycles to complete, and can be disabled.

System manufacturers can use the 80960SA and
80960SB's self-test feature during incoming parts
inspection. No special diagnostic programs need to be
written, and the test is both thorough and fast. The
self-test capability helps ensure that defective parts
will be discovered before systems are shipped, and
once in the field, the self-test makes it easier to
distinguish between problems caused by processor
failure and problems resulting from other causes.



CHMOS 

The 80960SA and 80960SB are fabricated using In- 
tel's CHMOS IV (Complementary High Speed Metal 
Oxide Semiconductor) process. This advanced tech- 
nology eliminates the frequency and reliability limita- 
tions of older CMOS processes and opens a new 
era in microprocessor performance. It combines the 
high performance capabilities of Intel's industry- 
leading HMOS technology with the high density and 
low power characteristics of CMOS. The 80960SA 
and 80960SB are available at 10 MHz in both PLCC 
and QFP packages, and at 16 MHz in the PLCC 
package. 
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Table 4. 80960SA and 80960SB Pin Description: Bus Signals 



Symbol         Type       Name and Function

CLK2           I          SYSTEM CLOCK provides the fundamental timing for 80960SA
                          and 80960SB systems. CLK2 is divided by two inside the
                          80960SA and 80960SB to generate the internal processor
                          clock.

A31-A16        O T.S.     ADDRESS BUS carries the upper 16 bits of the 32-bit
                          address to memory. It is valid throughout the burst
                          cycle; no latch is required.

AD15-AD1,D0    I/O T.S.   ADDRESS/DATA BUS carries the low-order bits of the
                          32-bit address and 16-bit data to and from memory.
                          AD15-AD4 must be latched since the cycle following the
                          address cycle carries data on the bus.

A3-A1          O T.S.     ADDRESS BUS carries the word addresses of the 32-bit
                          address to memory. These three bits are incremented
                          during a burst access, indicating the next word address
                          of the burst access. Note that A3-A1 are duplicated
                          with AD3-AD1 during the address cycle.

ALE            O T.S.     ADDRESS LATCH ENABLE indicates the transfer of a
                          physical address. ALE is asserted during a Ta cycle and
                          deasserted before the beginning of the following Td
                          state. It is active high and floats to a high impedance
                          state during a hold cycle (Th or Thr).

AS             O T.S.     ADDRESS STATUS indicates an address state. AS is
                          asserted every Ta state and deasserted during the
                          following Td state. AS is driven HIGH during reset.

W/R            O T.S.     WRITE/READ specifies, during a Ta cycle, whether the
                          operation is write or read. It is latched on-chip and
                          remains valid during Td cycles.

DEN            O T.S.     DATA ENABLE is asserted during Td cycles and indicates
                          transfer of data on the AD lines. The AD lines should
                          not be driven by an external source unless DEN is
                          asserted. When DEN is asserted, the outputs from the
                          previous cycle are guaranteed to be 3-stated. In
                          addition, DEN deasserted indicates inputs have been
                          captured and therefore input hold times can be
                          disregarded. DEN is driven HIGH during reset.

READY          I          READY indicates that data on AD lines can be sampled or
                          removed. If READY is not asserted during a Td cycle,
                          the Td cycle is extended to the next cycle by inserting
                          a wait state (Tw).

DT/R           O T.S.     DATA TRANSMIT/RECEIVE indicates the direction of the
                          data transfer to and from the bus. It is low during Ta
                          and Td cycles for a read or interrupt acknowledgement;
                          it is high during Ta and Td cycles for a write. DT/R
                          never changes state when DEN is asserted. DT/R is
                          driven HIGH during reset.

BLAST/FAIL     O T.S.     BURST LAST indicates the last data cycle (Td) of a
                          burst access. It is asserted low during the last Td and
                          associated Tw cycles in a burst access.

                          INITIALIZATION FAILURE indicates that the processor has
                          failed to initialize correctly. The failure state is
                          indicated by a combination of BLAST asserted and both
                          BE signals not asserted. This condition occurs after
                          RESET is deasserted and before the first bus
                          transaction begins. FAIL is asserted while the
                          processor performs a self-test. If the self-test
                          completes successfully, then FAIL is deasserted. Next,
                          the processor performs a zero checksum on the first
                          eight words of memory. If it fails, FAIL is asserted
                          for a second time and remains asserted; if it passes,
                          system initialization continues and FAIL remains
                          deasserted.



I/O = Input/Output, O = Output, I = Input, O.D. = Open-Drain, T.S. = 3-State. 
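
The zero checksum mentioned in the BLAST/FAIL entry can be illustrated with a short Python sketch. The data sheet does not spell out the arithmetic, so the sketch assumes plain 32-bit wraparound addition of the eight words; the function names and the sample header values are hypothetical and are not taken from Intel documentation.

```python
MASK32 = 0xFFFFFFFF  # words are assumed to be 32 bits wide


def zero_checksum_ok(words):
    """Return True when the eight words sum to zero modulo 2**32.

    Assumption: a simple modular sum; the real device may use a
    different rule (for example, a sum that includes carries).
    """
    if len(words) != 8:
        raise ValueError("expected exactly eight words")
    return (sum(w & MASK32 for w in words) & MASK32) == 0


def balancing_word(first_seven):
    """Choose an eighth word that brings the modular sum to zero."""
    if len(first_seven) != 7:
        raise ValueError("expected exactly seven words")
    return (-sum(w & MASK32 for w in first_seven)) & MASK32


# Hypothetical header contents, for illustration only.
header = [0x00000100, 0x00000200, 0x00000000, 0x00001000,
          0xFFFFFFFF, 0x00000000, 0x00000000]
header.append(balancing_word(header))
print(zero_checksum_ok(header))
```

Under this assumption, a build tool would compute the eighth word last so that the processor's check passes and FAIL stays deasserted after the self-test.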
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Table 4. 80960SA and 80960SB Pin Description: Bus Signals (Continued) 



Symbol         Type       Name and Function

RESET          I          RESET clears the internal logic of the processor and
                          causes it to reinitialize.

                          During RESET assertion, the input pins are ignored
                          (except for INT0, INT1, INT3, LOCK), the tri-state
                          output pins are placed in a HIGH impedance state
                          (except for DT/R, DEN, and AS), and other output pins
                          are placed in their non-asserted state.

                          RESET must be asserted for at least 41 CLK2 cycles for
                          a predictable reset. Optionally, for a synchronous
                          reset, the LOW to HIGH transition of RESET should occur
                          after the rising edge of both CLK2 and the external bus
                          clock, and before the next rising edge of CLK2.

                          The interrupt pins indicate the initialization sequence
                          executed. Typical initialization requires driving only
                          INT0 and INT3 to a HIGH state. The reset conditions
                          follow:

                          INT0 INT1 INT3 LOCK Action Taken

                          1 x 1 1 Run self-test (core initialization)
                          11 Disable self-test
                          1 x x Reserved
                          x x x Reserved
                          x x x ONCE mode (see LOCK pin)

BE1-BE0        O T.S.     BYTE ENABLE LINES specify which data bytes (up to two)
                          on the bus take part in the current bus cycle. BE1
                          corresponds to AD15-AD8 and BE0 corresponds to
                          AD7-AD1, D0. The byte enable lines are asserted
                          appropriately during each data cycle.

                          INITIALIZATION FAILURE indicates that the processor has
                          failed to initialize correctly. The failure state is
                          indicated by a combination of BLAST asserted and both
                          BE signals not asserted. This condition occurs after
                          RESET is deasserted and before the first bus
                          transaction begins. FAIL is asserted while the
                          processor performs a self-test. If the self-test
                          completes successfully, then FAIL is deasserted. Next,
                          the processor performs a zero checksum on the first
                          eight words of memory. If it fails, FAIL is asserted
                          for a second time and remains asserted; if it passes,
                          system initialization continues and FAIL remains
                          deasserted.

INT0           I          INTERRUPT 0 indicates a pending interrupt. The bus
                          interrupt control register determines in which way the
                          signal should be interpreted. To signal an interrupt
                          request in a synchronous system, this pin (as well as
                          the other interrupt pins) must be enabled by being
                          deasserted for at least one bus cycle and then asserted
                          for at least one additional bus cycle; in an
                          asynchronous system, the pin must remain deasserted for
                          at least two bus cycles and then be asserted for at
                          least two more bus cycles. INT0 is sampled during RESET
                          to determine if the self-test sequence is to be
                          executed.

INT1           I          INTERRUPT 1 indicates a direct interrupt, like INT0.
                          INT1 is sampled during RESET to determine if the
                          self-test sequence is to be executed.

INT2/INTR      I          INTERRUPT 2/INTERRUPT REQUEST: The interrupt control
                          register determines how this pin is interpreted. If
                          INT2, it has the same interpretation as the INT0 and
                          INT1 pins. If INTR, it is used to receive an interrupt
                          request from an external 8259A-compatible interrupt
                          controller.

INT3/INTA      I/O T.S.   INTERRUPT 3/INTERRUPT ACKNOWLEDGE: The interrupt
                          control register determines how this pin is
                          interpreted. If INT3, it has the same interpretation as
                          the INT0 and INT1 pins. If INTA, it is used as an
                          output to control interrupt-acknowledge bus
                          transactions. The INTA output is latched on-chip and
                          remains valid during Td cycles. INT3 must be pulled to
                          a HIGH state during RESET.



I/O = Input/Output, O = Output, I = Input, O.D. = Open-Drain, T.S. = 3-State. 
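
As an illustration of how the address fields in Table 4 fit together, the following Python sketch combines A31-A16, the latched AD15-AD4, and A3-A1 into one 32-bit address and reports which data lanes the byte enables select. The function and parameter names are hypothetical. Bit 0 is left at zero because the table assigns no address pin to it, and the byte enables are passed as logical "asserted" flags so that no electrical polarity is assumed.

```python
def bus_address(a31_16, ad15_4, a3_1):
    """Combine the three address fields into a 32-bit address.

    a31_16: 16-bit value on A31-A16
    ad15_4: 12-bit value latched from AD15-AD4 during the address cycle
    a3_1:   3-bit value on A3-A1 (incremented during a burst)
    """
    fields = (("a31_16", a31_16, 16), ("ad15_4", ad15_4, 12), ("a3_1", a3_1, 3))
    for name, value, bits in fields:
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{name} does not fit in {bits} bits")
    return (a31_16 << 16) | (ad15_4 << 4) | (a3_1 << 1)


def active_lanes(be1_asserted, be0_asserted):
    """List the data lanes that take part in the current data cycle."""
    lanes = []
    if be1_asserted:
        lanes.append("AD15-AD8")
    if be0_asserted:
        lanes.append("AD7-AD1,D0")
    return lanes


# A burst keeps the latched upper bits and steps only A3-A1.
burst = [bus_address(0x1234, 0x567, step) for step in range(3)]
print([hex(a) for a in burst])
print(active_lanes(True, False))
```

The sketch shows why only AD15-AD4 need an external latch: A31-A16 stay valid for the whole burst and A3-A1 are driven on their own pins.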
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Table 4. 80960SA and 80960SB Pin Description: Bus Signals (Continued) 



Symbol         Type       Name and Function

LOCK           I/O O.D.   BUS LOCK prevents other bus masters from gaining
                          control of the bus following the current cycle (if they
                          would assert LOCK to do so). LOCK is used by the
                          processor or any bus agent when it performs indivisible
                          Read/Modify/Write (RMW) operations. Do not leave LOCK
                          unconnected. It must be pulled HIGH for the processor
                          to function properly.

                          For a read that is designated as an RMW-read, LOCK is
                          examined. If asserted, the processor waits until it is
                          not asserted; if not asserted, the processor asserts
                          LOCK during the Ta cycle and leaves it asserted.

                          A write that is designated as an RMW-write deasserts
                          LOCK in the Ta cycle. During the time LOCK is asserted,
                          a bus agent can perform a normal read or write but no
                          RMW operations. LOCK is also held asserted during an
                          interrupt-acknowledge transaction.

                          ONCE MODE: The LOCK pin is sampled during reset. If it
                          is asserted LOW at the end of RESET, all outputs will
                          be 3-stated until the part is reset. ONCE MODE is used
                          in conjunction with an ICE.

HOLD           I          HOLD indicates a request from a secondary bus master to
                          acquire the bus. When the processor receives HOLD and
                          grants another master control of the bus, it floats its
                          tri-state bus lines and then asserts HLDA and enters
                          the Th state. When HOLD is deasserted, the processor
                          will deassert HLDA and go to either the Ti or Ta state.

HLDA           O T.S.     HOLD ACKNOWLEDGE indicates that bus control has been
                          relinquished to another bus master. This signal is
                          always driven. At RESET it is driven LOW.

N.C.           N/A        NOT CONNECTED indicates pins should not be connected.
                          Never connect any pin marked N.C.



I/O = Input/Output, O = Output, I = Input, O.D. = Open-Drain, T.S. = 3-State.



ELECTRICAL SPECIFICATIONS 



Power and Grounding 

The 80960SA and 80960SB are implemented in
CHMOS IV technology and have modest power
requirements. Their high clock frequency and numerous
output buffers (address/data, control, error, and
arbitration signals) can cause power surges as multiple
output buffers drive new signal levels simultaneously.
For clean on-chip power distribution at high
frequency, 12 Vcc and 13 Vss pins separately feed
functional units of the 80960SA and 80960SB in the
package.

Power and ground connections must be made to all
power and ground pins of the 80960SA and
80960SB. On the circuit board, all Vcc pins must be
strapped closely together, preferably on a power
plane. Likewise, all Vss pins should be strapped
together, preferably on a ground plane. These pins
may not be connected together within the chip.



Power Decoupling Recommendations 

Liberal decoupling capacitance should be placed 
near the 80960SA and 80960SB. The processor can 
cause transient power surges when driving the bus, 
particularly when it is connected to a large capaci- 
tive load. 

Low inductance capacitors and interconnects are 
recommended for best high frequency electrical per- 
formance. Inductance can be reduced by shortening 
the board traces between the processor and decou- 
pling capacitors as much as possible. 
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Connection Recommendations 

For reliable operation, always connect unused in- 
puts to an appropriate signal level. In particular, if 
one or more interrupt lines are not used, they should 
be deasserted. No inputs should ever be left float- 
ing. 

All open-drain outputs require a pullup device. While
in some cases a simple pullup resistor will be
adequate, we recommend a network of pullup and
pulldown resistors biased to a valid Vih (≥2.0V) and
terminated in the characteristic impedance of the
circuit board. Figure 6 shows our recommendations for
the resistor values for both a low and high current
drive network, which assumes that the circuit board
has a characteristic impedance of 100Ω. The
advantage of terminating the output signals in this
fashion is that it limits signal swing and reduces AC
power consumption.



Characteristic Curves 

The 80960SA and 80960SB characteristic curves 
shown in Figures 7 through 10 supply information 
regarding typical supply currents, typical current ver- 
sus frequency, worst case voltage versus output cur- 
rent on open drain pins and capacitive derating 
curves. 

Figure 7 shows the typical supply current
requirements over the operating temperature range of
the processor at supply voltage (Vcc) of 5V. Figure 8
shows the typical power supply current (Icc)
required by the 80960SA and 80960SB at various
operating frequencies when measured at three input
voltage (Vcc) levels.

For a given output current (Iol), the curve in Figure 9
shows the worst case output low voltage (Vol).
Figure 10 shows the typical capacitive derating curve
for the 80960SA and 80960SB measured from 1.5V
on the system clock (CLK) to 0.8V on the falling
edge and 2.0V on the rising edge of the bus
address/data (AD) signals.



Test Load Circuit 

Figure 11 illustrates the load circuit used to test the
80960SA and 80960SB's 3-state pins, and Figure 12
shows the load circuit used to test the open-drain
output. The open-drain test uses an active load
circuit in the form of a matched diode bridge. Since the
open-drain output sinks current, only the Iol legs of
the bridge are necessary and the Ioh legs are not
used. When the 80960SA and 80960SB driver under
test is turned off, the output pin is pulled up to Vref
(i.e., Voh). Diode D1 is turned off and the Iol current
source flows through diode D2.

When the 80960SA and 80960SB open-drain driver
under test is on, diode D1 is also on, and the voltage
on the pin being tested drops to Vol. Diode D2 turns
off and Iol flows through diode D1.
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Low Drive Network: 
• Voh = 2.45V to 3.0V
• Iol = 9.5 mA to 12 mA

High Drive Network:

• Voh = 2.48V to 3.0V
• Iol = 16 mA to 20 mA

Figure 6. Open Drain Connection Recommendations for
Low and High Current Drive Networks for the LOCK Pin
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Figure 7. Typical Supply Current 
vs Supply Voltage 



Figure 9. Worst Case Voltage vs 
Output Current on Open-Drain Pin 



Figure 11. Test Load Circuit for 
3-State Output Pins 
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3-STATE OUTPUT 
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Figure 8. Typical Current vs Frequency 



[Figure 10 plot not reproduced. Conditions: Temp = +85°C, Vcc = 4.5V.
Curves: FALLING and RISING. Horizontal axis: Capacitive Load (pF),
20 to 100.]



Figure 10. Capacitive Derating Curve 
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Figure 12. Test Load Circuit for 
Open-Drain Output Pins 
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ABSOLUTE MAXIMUM RATINGS 

Operating Temperature (PLCC) . . . . 0°C to +100°C Case

Operating Temperature (QFP) . . . . . 0°C to +100°C Case

Storage Temperature . . . . . . . . . -65°C to +150°C

Voltage on Any Pin (PLCC) . . . . . . -0.5V to Vcc + 0.5V

Voltage on Any Pin (QFP) . . . . . . -0.25V to Vcc + 0.25V

Power Dissipation . . . . . . . . . . 1.9W (16 MHz)



NOTICE: This data sheet contains preliminary infor- 
mation on new products in production. The specifica- 
tions are subject to change without notice. Verify with 
your local Intel Sales office that you have the latest 
data sheet before finalizing a design. 



* WARNING: Stressing the device beyond the "Absolute 
Maximum Ratings" may cause permanent damage. 
These are stress ratings only. Operation beyond the 
"Operating Conditions" is not recommended and ex- 
tended exposure beyond the "Operating Conditions" 
may affect device reliability. 



DC CHARACTERISTICS 

960SA/SB (10 MHz and 16 MHz): Tcase = 0°C to +100°C, Vcc = 5V ±10% unless otherwise noted.

Symbol  Parameter                    Min       Max        Units  Conditions

Vil     Input Low Voltage            -0.3      +0.8       V
Vih     Input High Voltage           2.0       Vcc + 0.3  V
Vcl     CLK2 Input Low Voltage       -0.3      +0.8       V
Vch     CLK2 Input High Voltage      0.7 Vcc   Vcc + 0.3  V
Vol     Output Low Voltage                     0.45       V      Iol = 2.5 mA
                                               0.45       V      Iol = 12 mA, LOCK Pin
                                               0.60       V      Iol = 20 mA, LOCK Pin
Voh     Output High Voltage          2.4                  V      All T.S., -2.5 mA (Note 4)
Icc     Power Supply Current:
          10 MHz, QFP                          280        mA     Tcase = 0°C (Note 1)
          10 MHz, PLCC                         280        mA     Tcase = 0°C
          16 MHz, PLCC                         350        mA     Tcase = 0°C
Ilo     Output Leakage Current                 ±15        µA     (Note 5)
Ili     Input Leakage Current                  ±15        µA     0 ≤ Vin ≤ Vcc (Note 2)
Cin     Input Capacitance                      10         pF     fc = 1 MHz (Note 3)
Co      I/O or Output Capacitance              12         pF     fc = 1 MHz (Note 3)
Cclk    Clock Capacitance                      10         pF     fc = 1 MHz (Note 3)

NOTES:

1. Tcase is specified at 0°C to +100°C for the QFP at 10 MHz and Vcc = 5V ±5%.

2. INT0 has an internal pullup that sources 100 µA.

3. Input, output and clock capacitance are not tested.

4. Not measured on open-drain outputs.

5. LOCK has an internal pullup that sources 100 µA.
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AC SPECIFICATIONS 

This section describes the AC specifications for the
80960SA and 80960SB pins. All input and output
timings are specified relative to the 1.5V level of the
rising edge of CLK2, and refer to the time at which
the signal crosses (for output delay and input setup)
1.5V. All AC testing should be done with input
voltages of 0.4V and 2.4V, except for the clock (CLK2),
which should be tested with input voltages of 0.45V
and 0.7 × Vcc. See Figure 13 for timing relationships
for the 80960SA and 80960SB signals.



[Figure 13 timing diagram not reproduced. CLK2 edges are labeled
A, B, C, D.
Outputs shown: AD15-AD1, A3-A1, D0, A31-A16, BE1-BE0, DEN, BLAST, W/R,
HLDA, LOCK, INTA, ALE, DT/R.
Inputs shown: AD15-AD1, D0, INT0, INT1, INT2/INTR, INT3, HOLD, LOCK,
READY.
Timing parameters marked: T6, T6AS, T8, T9, T10, T11, T12, T14.
Levels marked: 1.5V on CLK2 and outputs; 2.0V and 0.8V on inputs.]

Figure 13. Drive Levels and Timing Relationships of 80960SA and 80960SB Signals 
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AC Specification Tables 

80960SA and 80960SB AC Characteristics (10 MHz)

Symbol  Parameter                       Min    Max    Units  Test Conditions

T1      Processor Clock Period (CLK2)   50     125    ns     Vin = 1.5V
T2      Processor Clock Low Time (CLK2) 8             ns     Vt = 10% Point
                                                             = Vcl + (Vch - Vcl) × 0.1
T3      Processor Clock High Time       8             ns     Vt = 90% Point
        (CLK2)                                               = Vcl + (Vch - Vcl) × 0.9
T4      Processor Clock Fall Time              10     ns     Vt = 90% Point to 10% Point (3)
        (CLK2)
T5      Processor Clock Rise Time              10     ns     Vt = 10% Point to 90% Point (3)
        (CLK2)
T6      Output Valid Delay              2      31     ns     Cl = 100 pF (AD and Control)
T6AS    AS Output Valid Delay           2      25     ns     Cl = 50 pF
T7      ALE Width                       24            ns     Cl = 100 pF
T8      ALE Output Valid Delay          4      33     ns     Cl = 100 pF (1)
T9      Output Float Delay              2      20     ns     Cl = 100 pF (AD)
                                                             Cl = 100 pF (Controls) (1)
T10     Input Setup 1                   10            ns
T11     Input Hold                      2             ns     (Note 4)
T12     Input Setup 2                   13            ns
T13     Setup to ALE Inactive           10            ns     Cl = 100 pF
T14     Hold after ALE Inactive         8             ns     Cl = 100 pF
T15     RESET Hold                      3             ns     (Note 2)
T16     RESET Setup                     5             ns     (Note 2)
T17     RESET Width                     2050          ns     41 CLK2 Periods Minimum

NOTES:

1. A float condition occurs when the maximum output current becomes less than Ilo. Float delay is not tested, but should
be no longer than the valid delay.

2. Meeting RESET setup and hold times is an optional method of synchronizing your clocks. If you decide to use an asyn- 
chronous reset, then synchronizing the clock can be accomplished by using AS. 

3. Processor clock (CLK2) rise time and fall time are not tested. 

4. ICE requires a minimum of 4 ns input hold time. 
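
The T17 entry is stated as 41 CLK2 periods, so the minimum RESET width scales with the clock period actually used. The short Python sketch below reproduces that arithmetic; the function and constant names are hypothetical.

```python
RESET_MIN_CLK2_PERIODS = 41  # from the T17 test condition


def min_reset_width_ns(clk2_period_ns):
    """Minimum RESET assertion time for a given CLK2 period (T1)."""
    if clk2_period_ns <= 0:
        raise ValueError("CLK2 period must be positive")
    return RESET_MIN_CLK2_PERIODS * clk2_period_ns


print(min_reset_width_ns(50.0))    # shortest T1 in the 10 MHz table
print(min_reset_width_ns(125.0))   # longest T1 allowed by either table
```

A 50 ns period gives the 2050 ns listed for T17. A system running CLK2 more slowly needs a proportionally longer RESET pulse, up to 5125 ns at the 125 ns maximum period.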
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80960SA and 80960SB AC Characteristics (16 MHz PLCC) 


Symbol  Parameter                       Min    Max    Units  Test Conditions

T1      Processor Clock Period (CLK2)   31.25  125    ns     Vin = 1.5V
T2      Processor Clock Low Time (CLK2) 8             ns     Vt = 10% Point
                                                             = Vcl + (Vch - Vcl) × 0.1
T3      Processor Clock High Time       8             ns     Vt = 90% Point
        (CLK2)                                               = Vcl + (Vch - Vcl) × 0.9
T4      Processor Clock Fall Time              10     ns     Vt = 90% Point to 10% Point (3)
        (CLK2)
T5      Processor Clock Rise Time              10     ns     Vt = 10% Point to 90% Point (3)
        (CLK2)
T6      Output Valid Delay              2      25     ns     Cl = 100 pF (AD and Control)
T6AS    AS Output Valid Delay           2      21     ns     Cl = 50 pF
T7      ALE Width                       15            ns     Cl = 100 pF
T8      ALE Output Valid Delay          2      22     ns     Cl = 100 pF (1)
T9      Output Float Delay              2      20     ns     Cl = 100 pF (AD)
                                                             Cl = 100 pF (Controls) (1)
T10     Input Setup 1                   10            ns
T11     Input Hold                      2             ns     (Note 4)
T12     Input Setup 2                   13            ns
T13     Setup to ALE Inactive           10            ns     Cl = 100 pF
T14     Hold after ALE Inactive         8             ns     Cl = 100 pF
T15     RESET Hold                      3             ns     (Note 2)
T16     RESET Setup                     5             ns     (Note 2)
T17     RESET Width                     1281          ns     41 CLK2 Periods Minimum

NOTES:

1. A float condition occurs when the maximum output current becomes less than Ilo. Float delay is not tested, but should
be no longer than the valid delay.

2. Meeting RESET setup and hold times is an optional method of synchronizing your clocks. If you decide to use an asyn- 
chronous reset, then synchronizing the clock can be accomplished by using AS. 

3. Processor clock (CLK2) rise time and fall time are not tested. 

4. ICE requires a minimum of 4 ns input hold time. 
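
Because CLK2 is divided by two to form the processor clock, the minimum T1 values in the two tables correspond to the 10 MHz and 16 MHz ratings. The Python sketch below shows that conversion and, as a rough guide only, the duration of the self-test. It assumes the "approximately 47,000 cycles" quoted for the self-test are processor clock cycles, which the data sheet does not state explicitly; the function names are hypothetical.

```python
SELF_TEST_CYCLES = 47_000  # approximate figure quoted for the self-test


def processor_clock_mhz(clk2_period_ns):
    """Processor clock frequency, given that CLK2 is divided by two."""
    return 1000.0 / (2.0 * clk2_period_ns)


def self_test_time_ms(clk2_period_ns, cycles=SELF_TEST_CYCLES):
    """Approximate self-test duration.

    Assumption: `cycles` counts processor clock cycles, each lasting
    two CLK2 periods.
    """
    return cycles * 2.0 * clk2_period_ns / 1e6


for period_ns in (50.0, 31.25):
    print(period_ns, processor_clock_mhz(period_ns), self_test_time_ms(period_ns))
```

Under that assumption the self-test lasts a few milliseconds at either speed, which gives a starting point for sizing a power-on timeout that watches the FAIL indication.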






[Figure 14 waveform not reproduced. Signals shown: CLK, CLK2, AS,
A(4:15)/D(0:15), READY. Bus states shown: Ta, Td, Tw, Tr, and Ta or Ti.]

NOTES: 

1. The AD and control signals are driven at all times except during a HOLD acknowledge (HLDA asserted) RESET, and 
ONCE mode. 

2. The AD and control signals may toggle during idle (Ti) or recovery (Tr) cycles. 




Figure 14. Timing Relationships of the 80960SA and 80960SB Bus 
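
The bus states named in Figure 14 can be turned into a rough cycle count for one access. The Python sketch below is only an estimate under stated assumptions: one Ta cycle per access, one Td cycle per 16-bit transfer, each wait state adding one Tw cycle to its Td, and a configurable number of Tr recovery cycles (one by default). The function names are hypothetical and the model is not taken from Intel documentation.

```python
def access_clocks(transfers, wait_states_per_transfer=0, recovery_cycles=1):
    """Estimate processor clock cycles for one bus access.

    transfers: number of 16-bit data transfers (1 to 8, since A3-A1
               provide three burst address bits)
    """
    if not 1 <= transfers <= 8:
        raise ValueError("a burst is assumed to hold 1 to 8 transfers")
    if wait_states_per_transfer < 0 or recovery_cycles < 0:
        raise ValueError("cycle counts cannot be negative")
    address = 1                                        # Ta
    data = transfers * (1 + wait_states_per_transfer)  # Td plus Tw
    return address + data + recovery_cycles            # plus Tr


def access_time_ns(clocks, clk2_period_ns):
    """Convert processor clocks to time; each clock is two CLK2 periods."""
    return clocks * 2.0 * clk2_period_ns


single = access_clocks(1)
burst = access_clocks(8, wait_states_per_transfer=1)
print(single, access_time_ns(single, 31.25))
print(burst, access_time_ns(burst, 31.25))
```

The comparison shows why wait states matter most in bursts: each one is paid once per transfer, while the Ta and Tr overhead is paid once per access.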
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Initialization Parameters 




1. The A edge is defined as the first rising CLK2 edge after RESET is deasserted meeting the RESET hold and setup
times.

2. Initialization Parameters must be setup at least four CLK2s prior to the first A edge. 



Figure 15. RESET Signal Timing 
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Figure 16. HOLD Timing Relationships 



Design Considerations 



Input hold times can be disregarded by the designer
whenever the input is removed because a subsequent
output from the processor is deasserted (e.g.,
DEN becomes deasserted).

Whenever the processor generates an output that
indicates a transition into a subsequent state, any
outputs that are specified to be 3-stated in this new
state are guaranteed to be 3-stated. For example, in
the Td cycle following a Ta cycle for a read, the
minimum output delay of DEN is 2 ns, but the
maximum float time of AD is 20 ns. When DEN is
asserted, however, the AD outputs are guaranteed to
have been 3-stated.



Designing for the ICE-960SB 

The 80960SA and 80960SB In-Circuit Emulator as- 
sists in debugging 80960SA and 80960SB hardware 
and software designs. The product consists of a 
probe module, cable, control unit and power supply. 
Because of the high operating frequency of the 
80960SA and 80960SB systems, the probe module 
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connects directly to the 80960SA and 80960SB 
component (EIAJ QFP or PLCC) or a socket for the 
PLCC. 

When designing an 80960SA and 80960SB hard- 
ware system that uses the ICE-960SB to debug the 
system, several electrical and mechanical character- 
istics should be considered. These considerations 
include capacitive loading, drive requirement, power 
requirement, and physical layout. 

The ICE-960SB probe module increases the load
capacitance of each line by up to 25 pF. This load
originates from the probe module and is driven by
the 80960SA and 80960SB processor.

To achieve high noise immunity, the ICE-960SB
probe is powered by the user's system. The
high-speed probe circuitry draws up to 1.1A plus the
maximum current (Icc) of the 80960SA and 80960SB
processor.

The AD bus should not be driven by an external
source unless DEN is asserted. In addition, the ICE
requires a minimum data hold time of 4 ns.



The ICE-960SB probe will drive LOCK to a LOW
state during RESET to force the target 80960SA and
80960SB to enter ONCE mode. To guarantee timings,
the ICE requires that the supply voltage to the
80960SA and 80960SB be held within ±5%. The ICE
probe requires a minimum of 0.25 inches clearance on
all sides of both the EIAJ QFP and PLCC.



Lock Line Termination 

You must terminate the LOCK line as described in 
Figure 6 in order for the ICE to properly function. 
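
The probe figures given in this section (up to 25 pF added per line, and up to 1.1A drawn in addition to the processor's maximum Icc) lend themselves to a quick budget check. The Python sketch below is one way to do it. The names are hypothetical, and comparing the total load against the 100 pF test load from the AC tables is a design heuristic assumed here, not a rule stated by the data sheet.

```python
ICE_PROBE_MAX_CURRENT_A = 1.1   # probe circuitry, excluding the processor
ICE_PROBE_ADDED_LOAD_PF = 25.0  # worst-case capacitance added per line
AC_TEST_LOAD_PF = 100.0         # load at which most AC timings are specified


def supply_current_with_ice_a(icc_max_ma):
    """Worst-case current the target must supply with the probe attached."""
    return ICE_PROBE_MAX_CURRENT_A + icc_max_ma / 1000.0


def line_load_with_ice_pf(board_load_pf):
    """Worst-case capacitive load on one line with the probe attached."""
    return board_load_pf + ICE_PROBE_ADDED_LOAD_PF


def exceeds_test_load(board_load_pf):
    """True when the probed line is heavier than the AC test load."""
    return line_load_with_ice_pf(board_load_pf) > AC_TEST_LOAD_PF


print(supply_current_with_ice_a(280))  # 10 MHz part, maximum Icc in mA
print(supply_current_with_ice_a(350))  # 16 MHz part, maximum Icc in mA
print(exceeds_test_load(80.0))
```

A board whose lines already carry close to 100 pF leaves no margin for the probe, so output delays may exceed the tabulated values while the ICE is attached.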



MECHANICAL DATA 

Package Dimensions and Mounting 

The 80960SA and 80960SB are available in two different
packages: an 80-lead quad flat pack (EIAJ QFP),
shown in Figure 17, and an 84-lead plastic leaded
chip carrier (PLCC), shown in Figure 18.



Pin Assignment

The QFP and PLCC have different pin assignments.
The QFP pins are numbered in order from 1 to 80
around the package's perimeter. The PLCC pins are
numbered in order from 1 to 84 around the package's
perimeter. Tables 9 and 10 list the function of
each pin in the QFP. Tables 11 and 12 list the function
of each pin in the PLCC.

Vcc and GND connections must be made to multiple
Vcc and GND pins. Each Vcc and GND pin must be
connected to the appropriate voltage or ground and
externally strapped close to the package. We
recommend that you include separate power and
ground planes in your circuit board for power
distribution.

NOTE:

Pins identified as N.C., "No Connect," should never
be connected. The 80960SA and 80960SB QFP
package contains two N.C. pins and the PLCC package
contains six N.C. pins.

Package Thermal Specification

The 80960SA and 80960SB are specified for operation
when case temperature is within the range 0°C
to +85°C. The case temperature should be measured
at the top center of the package.

The ambient temperature can be calculated from θJC
and θJA by using the following equations:

TJ = TC + P * θJC
TA = TJ - P * θJA
TC = TA + P * [θJA - θJC]

Values for θJA and θJC are given in Table 7 for the
QFP package and in Table 8 for the PLCC package
for various airflows.

Example:

TA = TC - P * (θJA - θJC)

TC = Maximum Case Temperature

P = Maximum Supply Voltage times Icc
at 100°C and 10 MHz

θJA and θJC = QFP Package Thermal Resistance
at 0 ft/min airflow

TA = 51 = 100 - (5.5 * 0.213) * (45.7 - 4)


WAVEFORMS 

Figures 19 through 22 show the waveforms for various
signals on the 80960SA and 80960SB's bus.
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Table 7. QFP Package, Thermal Resistance— °C/Watt 





                                 Airflow (ft/min)

Parameter                        0      50     100    200    400    600    800

θJA Junction to Ambient          45.7   na     na     40     31     na     na
(Case measured in the middle
of the top of the package)
(No Heatsink)

θJC Junction to Case             4.0    na     na     4.5    5.5    na     na

NOTES:

1. This table applies to an 80960SA and 80960SB QFP soldered directly onto a board.

2. θJA = θJC + θCA.

3. Thermal data are based on copper lead frames. 
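
The ambient-temperature example from the Package Thermal Specification can be checked numerically. The Python sketch below applies the relation TA = TC - P * (θJA - θJC) using the 45.7 and 4.0 °C/W values from Table 7 and the 5.5V, 0.213A operating point quoted in the example; the function names are hypothetical.

```python
def ambient_temp_c(case_temp_c, power_w, theta_ja, theta_jc):
    """Ambient temperature: TA = TC - P * (thetaJA - thetaJC)."""
    return case_temp_c - power_w * (theta_ja - theta_jc)


def junction_temp_c(case_temp_c, power_w, theta_jc):
    """Junction temperature: TJ = TC + P * thetaJC."""
    return case_temp_c + power_w * theta_jc


power_w = 5.5 * 0.213  # maximum supply voltage times Icc, in watts
ta = ambient_temp_c(100.0, power_w, theta_ja=45.7, theta_jc=4.0)
tj = junction_temp_c(100.0, power_w, theta_jc=4.0)
print(round(ta), round(tj, 1))
```

The result rounds to the 51°C shown in the example. Substituting the θ values for a higher airflow column shows how much additional ambient headroom forced air provides.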

Table 8. PLCC Package, Thermal Resistance (°C/Watt)

                            Airflow (ft/min)

Parameter                   0      50     100    200    400    600    800    1000

θJA Junction to Ambient     33     na     na     27     23.8   22     20     19.5
(No Heatsink)

θJC Junction to Case        13     na     na     na     na     na     na     na

NOTES:

1. This table applies to an 80960SA and 80960SB PLCC soldered directly onto a board.
2. θJA = θJC + θCA.
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Figure 18. 84-Lead Plastic Leaded Chip Carrier 



Figure 17. 80-Lead El A J Quad Flat Pack Package 
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Table 9. 80960SA and 80960SB QFP Pinout— In Pin Order 



Pin 


Signal 


1 


A22 


2 


A21 


3 


A20 


4 


A19 


5 


A18 


6 


A17 


7 


A16 


8 


Vcc 


9 


Vss 


10 


AD15 


11 


ADM 


12 


Vcc 


13 


Vss 


14 


AD13 


15 


AD12 


16 


AD11 


17 


AD10 


18 


AD9 


19 


AD8 


20 


AD7 



Pin 


Signal 


21 


Vcc 


22 


Vss 


23 


Vcc 


24 


Vss 


25 


AD6 


26 


AD5 


27' 


AD4 


28 


AD3 


29 


AD2 


30 


AD1 


31 


DO 


32 


Vss 


33 


Vcc 


34 


A3 


35 


A2 


36 


Vcc 


37 


Vss 


28 


A1 


39 


N.C. 


40 


BET 



Pin 


Signal 


41 


BEO 


42 


Vcc 


43 


Vss 


44 


CLK2 


45 


RESET 


46 


INTO 


47 


INT1 


48 


INT2/INTR 


49 


INT3/INTA 


50 


HLDA 


51 


Vcc 


52 


Vss 


53 


HOLD 


54 


W/R 


55 


DEN 


56 


DT/R 


57 


BLAST 


58 


LOCK 


59 


Vcc 


60 


Vss 



Table 10. 80960SA and 80960SB QFP Pinout— In Signal Order 



Signal 


Pin 


A1 


38 


A2 


35 


A3 


34 


AD1 


30 


AD2 


29 


AD3 


28 


AD4 


27 


AD5 


26 


AD6 


25 


AD7 


20 


AD8 


19 


AD9 


18 


AD10 


17 


AD11 


16 


AD12 


15 


AD13 


14 


AD14 


11 


AD15 


10 


A16 


7 


A17 


6 



Signal 


Pin 


A18 


5 


A19 


4 


A20 


3 


A21 


2 


A22 


1 


A23 


80 


A24 


79 


A25 


76 


A26 


75 


A27 


74 


A28 


71 


A29 


70 


A30 


69 


A31 


68 


ALE 


66 


AS 


64 


BEO 


41 


BET 


40 


BLAST 


57 


CLK2 


44 



Signal 


Pin 


DO 


31 


DEN 


55 


DT/R 


56 


HLDA 


50 


HOLD 


53 


INTO 


46 


INT1 


47 


INT2/INTR 


48 


INT3/INTA 


49 


LOCK 


58 


N.C. 


39 


N.C. 


63 


READY 


67 


RESET 


45 


Vcc 


12 


Vcc 


21 


Vcc 


23 


Vcc 


33 


Vcc 


36 


Vcc 


42 



Pin  Signal
61   Vcc
62   Vss
63   N.C.
64   AS
65   Vss
66   ALE
67   READY
68   A31
69   A30
70   A29
71   A28
72   Vss
73   Vcc
74   A27
75   A26
76   A25
77   Vcc
78   Vss
79   A24
80   A23



Signal     Pin
Vcc        51
Vcc        59
Vcc        61
Vcc        73
Vcc        77
Vcc        8
Vss        13
Vss        22
Vss        24
Vss        32
Vss        37
Vss        43
Vss        52
Vss        60
Vss        62
Vss        72
Vss        78
Vss        9
Vss        65
W/R        54
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Table 1 1. 80960SA and 80960SB PLCC Pinout— In Pin Order 



Pin  Signal    Pin  Signal    Pin  Signal      Pin  Signal
1    Vcc       22   Vss       43   Vss         64   HOLD
2    N.C.      23   N.C.      44   Vcc         65   N.C.
3    A27       24   AD13      45   A3          66   W/R
4    A26       25   AD12      46   A2          67   DEN
5    A25       26   AD11      47   Vcc         68   DT/R
6    Vcc       27   AD10      48   Vss         69   BLAST
7    Vss       28   AD9       49   A1          70   LOCK
8    A24       29   AD8       50   N.C.        71   Vcc
9    A23       30   AD7       51   BE1         72   Vss
10   A22       31   Vcc       52   BE0         73   Vcc
11   A21       32   Vss       53   Vcc         74   Vss
12   A20       33   Vcc       54   Vss         75   N.C.
13   A19       34   Vss       55   CLK2        76   AS
14   A18       35   AD6       56   RESET       77   Vss
15   A17       36   AD5       57   INT0        78   ALE
16   A16       37   AD4       58   INT1        79   READY
17   Vcc       38   AD3       59   INT2/INTR   80   A31
18   Vss       39   AD2       60   INT3/INTA   81   A30
19   AD15      40   AD1       61   HLDA        82   A29
20   AD14      41   D0        62   Vcc         83   A28
21   Vcc       42   N.C.      63   Vss         84   Vss
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Table 12. 80960SA and 80960SB PLCC Pinout— In Signal Order 



Signal   Pin     Signal   Pin
A1       49      A18      14
A2       46      A19      13
A3       45      A20      12
D0       41      A21      11
AD1      40      A22      10
AD2      39      A23      9
AD3      38      A24      8
AD4      37      A25      5
AD5      36      A26      4
AD6      35      A27      3
AD7      30      A28      83
AD8      29      A29      82
AD9      28      A30      81
AD10     27      A31      80
AD11     26      ALE      78
AD12     25      AS       76
AD13     24      BE0      52
AD14     20      BE1      51
AD15     19      BLAST    69
A16      16      CLK2     55
A17      15      DEN      67
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Figure 19. Basic 80960SA and 80960SB Timing 
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Figure 20. 80960SA and 80960SB Timing 
Showing a Four Word Aligned Read Burst 
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Figure 21. 80960SA and 80960SB Double Word Read Timing with Wait States 
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Figure 22. 80960SA and 80960SB Aligned Double Word Write Timing with Wait States 
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Figure 23. 80960SA 80960SB Timing with a Four Word Read Burst Misaligned by One Byte 
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Figure 24. 80960SA and 80960SB Timing with a Three Word Write Burst 
Misaligned by One Byte and One Wait State 
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INTRODUCTION 

This chapter provides an overview of the Intel i960 KB 
processor (which is part of the i960 K series of embed- 
ded-processor products). 

All of the processors in the i960 K series of products 
are based on the Intel i960™ architecture. Most of the 
information in this overview also applies to the i960 
KA processor. The only difference between the i960 
KB and i960 KA processors is that the i960 KA proc- 
essor does not provide on-chip support for floating- 
point operations or operations on decimal numbers. 



OVERVIEW OF THE i960™ KB 
ARCHITECTURE 

The i960 KB processor introduced the i960 architecture, 
a new 32-bit architecture from Intel. This architecture 
has been designed to meet the needs of embedded 
applications such as machine control, robotics, process 
control, avionics and instrumentation. 

The i960 architecture can best be characterized as a 
high-performance computing engine. It features high- 
speed instruction execution and ease of programming. 
It is also easily extensible, allowing processors and con- 
trollers based on this architecture to be conveniently 
customized to meet the needs of specific processing and 
control applications. 

The following are some of the important attributes of 
the i960 architecture: 

• full 32-bit registers 

• high-speed, pipelined instruction execution 

• a convenient program execution environment with 
32 general-purpose registers and a versatile set of 
special-function registers 

• a highly optimized procedure call mechanism that 
features on-chip caching of local variables and pa- 
rameters 

• extensive facilities for handling interrupts and faults 

• extensive tracing facilities to support efficient pro- 
gram debugging and monitoring 

• register scoreboarding and write buffering to permit 
efficient operation when used with lower perform- 
ance memory subsystems 



OVERVIEW OF THE SINGLE 
PROCESSOR SYSTEM 
ARCHITECTURE 

The central processing module, memory module and 
I/O module form the natural boundaries for the hard- 
ware system architecture. The modules are connected 
together by the high bandwidth 32-bit multiplexed 
L-bus, which can transfer data at a maximum sustained 
rate of 53 Mbytes per second for an i960 processor op- 
erating at 20 MHz. 
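
The 53 Mbytes per second figure is consistent with a four-word burst that occupies six clock cycles. The sketch below checks that arithmetic. The cycle breakdown (four data cycles plus two overhead cycles per burst) is an assumption made for illustration and is not stated in this overview; the function name is hypothetical.

```python
def sustained_bandwidth_mbytes(clock_hz, words_per_burst=4, bytes_per_word=4,
                               overhead_cycles=2):
    """Return sustained bus bandwidth in Mbytes/s (1 Mbyte = 10**6 bytes).

    overhead_cycles is an assumed per-burst overhead, for example one
    address cycle and one recovery cycle.
    """
    cycles_per_burst = words_per_burst + overhead_cycles
    bytes_per_burst = words_per_burst * bytes_per_word
    return clock_hz * bytes_per_burst / cycles_per_burst / 1e6


print(round(sustained_bandwidth_mbytes(20_000_000), 1))
```

Under the same assumption, a 25 MHz clock gives about 66.7 Mbytes per second.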

Figure 1 shows a simplified block diagram of one possi- 
ble system configuration. The heart of this system is the 
i960 KB processor, which fetches instructions, executes 
code, manipulates stored information and interacts 
with I/O devices. The high bandwidth L-bus connects 
the i960 KB processor to memory and I/O modules. 
The i960 KB processor stores system data, instructions 
and programs in the memory module. By accessing var- 
ious peripheral devices in the I/O module, the i960 KB 
processor supports communication to terminals, mo- 
dems, printers, disks and other I/O devices. 



i960™ KB Processor and the L-Bus 

The i960 KB processor performs bus operations using 
multiplexed address and data signals, and provides all 
the necessary control signals. For example, standard 
control signals, such as Address Latch Enable (ALE), 
Address/Data Status (ADS), Write/Read Command 
(W/R), Data Transmit/Receive (DT/R) and Data 
Enable (DEN), are provided by the i960 KB processor. 
The i960 processor also generates byte enable signals 
that specify which bytes on the 32-bit data lines are 
valid for the transfer. 
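
As an illustration of how byte enables select lanes on a 32-bit bus, the following sketch computes a logical four-bit mask for an access that fits inside one word. The lane numbering (lane n carries the byte at word address + n) and the function name are assumptions for illustration; the electrical polarity of the actual signals is not modeled.

```python
def byte_enables(address, size):
    """Return a 4-bit mask; bit n set means byte lane n carries valid data.

    Assumes byte lane n holds the byte at (word address + n) and that the
    access does not cross a 4-byte word boundary.
    """
    offset = address & 0x3
    if size not in (1, 2, 4) or offset + size > 4:
        raise ValueError("access must fit inside one 32-bit word")
    return ((1 << size) - 1) << offset


# A 16-bit access to the upper half of a word enables lanes 2 and 3.
print(format(byte_enables(0x1002, 2), "04b"))
```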

The L-bus supports burst transactions, which access up 
to four data words at a maximum rate of one word per 
clock cycle. The i960 KB processor uses the two low- 
order address lines to indicate how many words are to 
be transferred. The i960 KB processor performs burst 
transactions to load the on-chip 512-byte instruction 
cache to minimize memory accesses for instruction 
fetches. Burst transactions can also be used for data 
access. 

To transfer control of the bus to an external bus master, 
the i960 KB provides two arbitration signals: hold re- 
quest (HOLD) and hold acknowledge (HLDA). After 
receiving HOLD, the processor grants control of the 
bus to an external master by asserting HLDA. 
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Figure 1. Basic i960TM KB System Configuration 



The i960 KB processor provides a flexible interrupt 
structure by using an on-chip interrupt controller, an 
external interrupt controller or both. The type of inter- 
rupt structure is specified by an internal interrupt vec- 
tor register. For a system with multiple processors, 
another method is available, called inter-agent commu- 
nication (IAC) where a processor can interrupt another 
processor by sending an IAC message. 



Memory Module 

A memory module can consist of a memory controller, 
Erasable Programmable Read Only Memory 
(EPROM), and static or dynamic Random Access 
Memory (RAM). The memory controller first conditions 
the L-bus signals for memory operation. It 
demultiplexes the address and data lines, generates the 
chip select signals from the address, detects the start 
of the cycle for burst mode operation and latches the 
byte enable signals. 

The memory controller generates the control signals for 
EPROM, SRAM and DRAM. Specifically, it provides 
the control signals, multiplexed row/column address 
and refresh control for dynamic RAMs. The controller 
can be designed to accommodate the burst transaction 
of the i960 KB processor by using the static column 
mode or nibble mode features of the dynamic RAM. In 
addition to supplying the operational signals, the 
controller generates the READY signal to indicate that 
data can be transferred to or from the i960 KB 
processor. 

The i960 KB processor directly addresses up to 
4 Gbytes of physical memory. The processor does not 
allow burst accesses to cross a 16-byte boundary, to 
ease the design of the controller. Each address specifies 
a four-byte data word within the block. Individual data 
bytes can be accessed by using the four byte-enable sig- 
nals from the i960 KB processor. Chapter 5 provides 
design guidelines for the memory controller. 
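
The 16-byte boundary rule can be illustrated by splitting a transfer into pieces that each stay inside one aligned 16-byte block. This is only a sketch of the constraint; how the processor actually sequences the resulting bus cycles is not described here, and the function name is hypothetical.

```python
def split_bursts(address, length, boundary=16):
    """Split [address, address + length) into pieces that each stay inside
    one boundary-aligned block. Returns a list of (start, length) tuples."""
    pieces = []
    end = address + length
    while address < end:
        block_end = (address // boundary + 1) * boundary
        piece_end = min(end, block_end)
        pieces.append((address, piece_end - address))
        address = piece_end
    return pieces


# A 16-byte transfer starting halfway into a block needs two pieces.
print([(hex(start), size) for start, size in split_bursts(0x1008, 16)])
```

An aligned 16-byte transfer stays in a single piece, which is the case a burst-capable memory controller has to handle.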



I/O Module 

The I/O module consists of the I/O components and 
the interface circuit. I/O components can be used to 
allow the i960 KB processor to use most of its clock 
cycles for computational and system management ac- 
tivities. Time consuming tasks can be off-loaded to spe- 
cialized slave-type components, such as the 8259A Pro- 
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grammable Interrupt Controller or the 82530 Serial 
Communication Controller. Some tasks may require a 
master-type component, such as the 82586 Local Area 
Network Control. 

The interface circuit performs several functions. It de- 
multiplexes the address and data lines, generates the 
chip select signals from the address, produces the I/O 
read or I/O write command from the processor's W/R 
signal, latches the byte enable signals and generates the 
READY signals. Since some of these functions are 
identical to those of the memory controller, the same 
logic can be used for both interfaces. For master-type 
peripherals that operate on a 16-bit data bus, the inter- 
face circuit translates the 32-bit data bus to a 16-bit 
data bus. 

The i960 KB processor uses memory-mapped addresses 
to access I/O devices. This allows the CPU to use many 
of the same instructions to exchange information for 
both memory and peripheral devices. Thus, the power- 
ful memory-type instructions can be used to perform 8-, 
16- and 32-bit data transfers. 



HIGH PERFORMANCE PROGRAM 
EXECUTION 

Much of the design of the i960 architecture has been 
aimed at maximizing the processor's computational 
and data processing speed through the use of increased 
parallelism. The following paragraphs describe several 
of the mechanisms and techniques used to accomplish 
this goal. 



Load and Store Model 

One of the more important features of the i960 archi- 
tecture is its performance of most operations on oper- 
ands in registers, rather than in memory. For example, 
all arithmetic, logic, comparison, branching and bit op- 
erations are performed with registers and literals. 

This feature provides two benefits. First, it increases 
program execution speed by minimizing the number of 
memory accesses necessary to execute a program. Sec- 
ond, it reduces the memory latency encountered when 
using slower, lower-cost memory parts. 

To support this concept, the architecture provides a 
generous supply of general-purpose registers. For each 
procedure, 32 registers are available, 28 of which are 
available for general use. These registers are divided 
into two types: global and local. Both types of registers 
can be used for general storage of operands. The only 
difference is that global registers retain their contents 
across procedure boundaries, whereas the processor al- 
locates a new set of local registers each time a new 
procedure is called. 



The architecture also provides a set of fast, versatile 
load and store instructions. These instructions allow 
burst transfers of 1, 2, 4, 8, 12 or 16 bytes of informa- 
tion between memory and the registers. 



On-Chip Caching of Code and Data 

To further reduce memory accesses, the architecture 
offers two mechanisms for caching code and data on 
chip: an instruction cache and multiple sets of local 
registers. The instruction cache allows prefetching of 
blocks of instruction from memory. This helps ensure 
that the instruction execution pipeline is supplied with 
a steady stream of instructions. It also reduces the 
number of memory accesses required when performing 
iterative operations such as loops. The architecture al- 
lows the size of the instruction cache to vary. For the 
i960 KB processor, it is 512 bytes. 

To optimize the architecture's procedure call 
mechanism, the processor provides multiple sets of local 
registers. This allows the processor to perform procedure 
calls without having to write the local registers out to 
the stack in memory. The number of register sets 
depends on the processor implementation. The i960 KB 
processor provides four sets of local registers. 



Overlapped Instruction Execution 

The i960 architecture also enhances program execution 
speed by overlapping the execution of some 
instructions. In the i960 K series of processors, this is 
accomplished through register scoreboarding. 

Register scoreboarding permits instruction execution to 
continue while data is being fetched from memory. 
When a load instruction is executed, the processor sets 
one or more scoreboard bits to indicate the target regis- 
ters to be loaded. After the target registers are loaded, 
the scoreboard bits are cleared. While the target regis- 
ters are being loaded, the processor is allowed to exe- 
cute other instructions that do not use these registers. 

The processor uses the scoreboard bits to ensure that 
the target registers are not used until the load is 
complete. (Scoreboard bits are checked transparently to 
software.) This technique allows code to be executed 
such that some instructions can be executed in zero 
clock cycles (that is, executed for free). 
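
The scoreboarding behavior described above can be sketched as a toy model with one busy bit per register. The class and method names are hypothetical, and timing is not modeled; the sketch only shows which instructions may proceed while a load is outstanding.

```python
class Scoreboard:
    """Toy model: one busy bit per register, set by a load, cleared on completion."""

    def __init__(self):
        self.busy = set()

    def issue_load(self, dest):
        self.busy.add(dest)          # mark the target register as pending

    def load_complete(self, dest):
        self.busy.discard(dest)      # data arrived; register usable again

    def can_issue(self, sources, dest):
        # An instruction may proceed only if none of its registers is pending.
        return not (self.busy & (set(sources) | {dest}))


sb = Scoreboard()
sb.issue_load("r4")
print(sb.can_issue(["r5", "r6"], "r7"))   # does not touch r4
print(sb.can_issue(["r4", "r5"], "r8"))   # reads r4 while the load is pending
sb.load_complete("r4")
print(sb.can_issue(["r4", "r5"], "r8"))   # same instruction after the load
```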



Single-Clock Instructions 

The i960 architecture is designed to let a processor exe- 
cute commonly used instructions, such as moves, adds, 
subtracts, logical operations and branches, in a mini- 
mum number of clock cycles (preferably one cycle). 
The architecture supports this concept in several 
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ways. For example, the load and store model described 
earlier eliminates the clock cycles required to perform 
memory-to-memory operations, by concentrating on 
register-to-register operations. 

In addition, all of the instructions in the i960 architec- 
ture are 32 bits long and aligned on 32-bit boundaries. 
This lets instructions be decoded in one clock cycle, 
and eliminates the need for an instruction-alignment 
stage in the pipeline. 

The i960 KB processor takes full advantage of these 
features of the architecture, resulting in more than 50 
instructions that can be executed in a single clock cycle. 



Efficient Interrupt Model 

The i960 architecture provides an efficient mechanism 
for servicing interrupts from external sources. To han- 
dle interrupts, the processor maintains an interrupt ta- 
ble of 248 interrupt vectors, 240 of which are available 
for general use. When an interrupt is signaled, the proc- 
essor uses a pointer to the interrupt table to perform an 
implicit call to an interrupt handler procedure. In per- 
forming this call, the processor automatically saves the 
state of the processor prior to receiving the interrupt, 
performs the interrupt routine, then restores the state of 
the processor. A separate interrupt stack is also provid- 
ed to segregate interrupt handling from application 
programs. 

The interrupt handling facilities also allow interrupts to 
be evaluated by priority. The processor is then able to 
store interrupt vectors that are lower in priority than 
the current processor task in a pending interrupt sec- 
tion of the interrupt table. The processor checks and 
services the pending interrupts at defined times. 
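
A minimal sketch of priority-based posting follows. It is a toy model, not the layout of the interrupt table: the mapping of eight vectors per priority level is an assumption made for illustration, and the class and method names are hypothetical.

```python
class InterruptModel:
    """Toy model of priority-based interrupt posting (not the real table layout)."""

    def __init__(self, current_priority):
        self.current_priority = current_priority
        self.pending = []            # vectors waiting for a lower task priority
        self.serviced = []           # vectors in the order they were serviced

    @staticmethod
    def priority(vector):
        return vector // 8           # assumed mapping: eight vectors per level

    def signal(self, vector):
        if self.priority(vector) > self.current_priority:
            self.serviced.append(vector)
        else:
            self.pending.append(vector)

    def check_pending(self, new_priority):
        """Model a 'defined time', such as a change in task priority."""
        self.current_priority = new_priority
        ready = [v for v in self.pending if self.priority(v) > new_priority]
        ready.sort(key=self.priority, reverse=True)   # highest priority first
        self.serviced.extend(ready)
        self.pending = [v for v in self.pending if v not in ready]


model = InterruptModel(current_priority=10)
for vector in (200, 40, 72):
    model.signal(vector)
model.check_pending(new_priority=4)
print(model.serviced, model.pending)
```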



SIMPLIFIED PROGRAMMING 
ENVIRONMENT 

Because of their streamlined execution environment, 
processors based on the i960 architecture are 
particularly easy to program. The following paragraphs 
describe some of the architecture features that simplify 
programming. 



Highly Efficient Procedure Call 
Mechanism 

The procedure call mechanism makes procedure calls 
and parameter passing between procedures simple and 
compact. Each time a call instruction is issued, the 
processor automatically saves the current set of local 
registers and allocates a new set for the called 
procedure. Likewise, on a return from a procedure, the 
current set of local registers is deallocated and the local 
registers for the procedure being returned to are 
restored. This means a program never has to explicitly 
save and restore those local variables that are stored in 
local registers. 



Versatile Instruction Set and 
Addressing 

The selection of instructions and addressing modes also 
simplifies programming. A full set of load, store, move, 
arithmetic, comparison and branch instructions is 
provided, with operations on both integer and ordinal 
data types. Operations on bits and bit strings are 
simplified by a complete set of Boolean and bit-field 
instructions. 

The addressing modes are efficient and straightforward, 
while at the same time providing the necessary indexing 
and scaling modes required to address complex arrays 
and record structures. The large 4-gigabyte address 
space provides ample room to store programs and data. 
The availability of 32 addressing lines allows some ad- 
dress lines to be memory-mapped to control hardware 
functions. 



Extensive Fault Handling Capability 

To aid in program development, the i960 architecture 
defines a wide range of faults that the processor detects, 
including arithmetic faults, invalid operations, invalid 
operands and machine faults. When a fault is detected, 
the processor makes an implicit call to a fault handler 
routine, in a way similar to the interrupt mechanism 
described previously. The information collected for 
each fault allows program developers to quickly correct 
faulting code, and allows automatic recovery from 
some faults. 



Debugging and Monitoring 

To support debugging systems, the i960 architecture 
provides a mechanism for monitoring processor activity 
by means of trace events. When the processor detects a 
trace event, it signals a trace fault and calls a fault han- 
dler. Intel provides several tools that use this feature, 
including an in-circuit emulator (ICE) device. 



SUPPORT FOR ARCHITECTURAL 
EXTENSIONS 

The i960 architecture provides several features that en- 
able processors based on this architecture to be easily 
customized to meet the needs of specific embedded ap- 
plications, such as signal processing, array processing 
or graphics processing. 
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The most important of these features is the set of 32 
special function registers. These registers provide a 
convenient interface to circuitry in the processor or pins 
that can be connected to external hardware. They can 
be used to control timers, to perform operations on spe- 
cial data types or to perform I/O functions. The special 
function registers are similar to the global registers. 
They can be addressed by all of the register access in- 
structions. 



EXTENSIONS INCLUDED IN THE 
i960™ K SERIES PROCESSORS 

The i960 K series of processors provides a complete 
implementation of the i960 architecture, plus several 
extensions to that architecture. These extensions fall 
into two categories: floating-point processing and inter- 
agent communication. 



On-Chip Floating Point 

The i960 KB processor provides a complete 
implementation of the IEEE standard for binary 
floating-point arithmetic (IEEE 754-1985). This 
implementation includes a full set of floating-point 
operations, including add, subtract, multiply, divide, 
trigonometric functions and logarithmic functions. These 
operations are performed on single precision (32-bit), 
double precision (64-bit) and extended precision 
(80-bit) real numbers. 

One of the benefits of this implementation is that the 
floating-point handling facilities are integrated into the 
normal instruction execution environment. Single and 
double precision floating-point values are stored in the 
same registers as non-floating point values. Four 80-bit 
floating-point registers are provided to hold extended- 
precision values. 



Interagent Communication 

All of the processors in the i960 K series provide an 
inter-agent communication (IAC) mechanism, allowing 
agents connected to the processor's bus to communicate 
with one another. This mechanism operates similarly 
to the interrupt mechanism, except that IAC messages 
are passed through dedicated sections of memory. 
Tasks handled with IAC messages include processor 
reinitialization, stopping the processor, purging 
the instruction cache and forcing the processor to check 
pending interrupts. 
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EMBEDDED 32-BIT PROCESSOR 



■ High-Performance Embedded 
Architecture 

— 25 MIPS Burst Execution at 25 MHz 
— 9.4 MIPS* Sustained Execution at 25 MHz 

■ 512-Byte On-Chip Instruction Cache 

— Direct Mapped 

— Parallel Load/Decode for Uncached 
Instructions 

■ Pin Compatible with 80960KB 

■ Multiple Register Sets 

— Sixteen Global 32-Bit Registers 

— Sixteen Local 32-Bit Registers 
— Four Local Register Sets Stored On-Chip 

— Register Scoreboarding 

The 80960KA is a member of Intel's new 32-bit processor family, the i960 series, which is designed especially 
for embedded applications. It is based on the family's high performance, common core architecture, and 
includes a 512-byte instruction cache and a built-in interrupt controller. The 80960KA has a large register set, 
multiple parallel execution units and a high-bandwidth, burst bus. Using advanced RISC technology, this high 
performance processor is capable of execution rates in excess of 9.4 million instructions per second.* The 
80960KA is well-suited for a wide range of embedded applications, including laser printers, image processing, 
industrial control, robotics and telecommunications. 

*Relative to Digital Equipment Corporation's VAX-11/780** at 1 MIPS 



■ Built-in Interrupt Controller 

— 32 Priority Levels, 256 Vectors 

— 3.4 μs Latency @ 25 MHz 

■ Easy to Use, High Bandwidth 32-Bit Bus 

— 66.7 Mbytes/s Burst 

— Up to 16-Bytes Transferred per Burst 

■ 4 Gigabyte, Linear Address Space 

■ 132-Lead Pin Grid Array (PGA) Package 

■ 132-Lead Plastic Quad Flat Pack (PQFP) 

■ Uses 85C960 Bus Controller 

■ Supported by 27960KX Burst EPROMs 
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Figure 1. The 80960KA's Highly Parallel Microarchitecture 



** VAX-11™ is a trademark of Digital Equipment Corporation. 
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THE 960 SERIES 

The 80960KA is a member of a new family of 32-bit 
microprocessors from Intel known as the i960 Se- 
ries. This series was especially designed to serve 
the needs of embedded applications. The embed- 
ded market includes applications as diverse as in- 
dustrial automation, avionics, image processing, 
graphics, robotics, telecommunications and automo- 
biles. These types of applications require high 
integration, low power consumption, quick interrupt 
response times and high performance. Since time to 
market is critical, embedded microprocessors need 
to be easy to use in both hardware and software 
designs. 



All members of the 80960 series share a common 
core architecture which utilizes RISC technology so 
that, except for special functions, the family mem- 
bers are object code compatible. Each new proces- 
sor in the series will add its own special set of func- 
tions to the core to satisfy the needs of a specific 
application or range of applications in the embedded 
market. For example, future processors may include 
a DMA controller, a timer or an A/D converter. 

Software written for the 80960KA will run without 
modification on any other member of the 80960 fam- 
ily. It is also pin-compatible with the 80960KB, which 
includes an integrated floating-point unit, and the 
80960MC, a military-grade version with support for 
multitasking, memory management, multiprocessing 
and fault tolerance. 
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Figure 2. Register Set 






KEY PERFORMANCE FEATURES 

The 80960KA's architecture is based on the most 
recent advances in RISC technology and is grounded 
in Intel's long experience in designing embedded 
controllers. Many features contribute to the 
80960KA's exceptional performance: 

1. Large Register Set. Modern compilers can take 
advantage of a large number of registers to optimize 
execution speed. For maximum flexibility, the 
80960KA provides 32 32-bit registers and four 80-bit 
floating-point registers. (See Figure 2.) 

2. Fast Instruction Execution. Simple functions 
make up the bulk of instructions in most programs, 
so that execution speed can be greatly improved by 
ensuring that these core instructions execute in as 
short a time as possible. The most frequently 
executed instructions, such as register-register moves, 
add/subtract, logical operations and shifts, execute 
in one to two cycles. (Table 1 contains a list of 
instructions.) 

3. Load/Store Architecture. One way to improve 
execution speed is to reduce the number of times 
that the processor must access memory to perform 
an operation. Like other processors based on RISC 
technology, the 80960KA has a Load/Store 
architecture: only the LOAD and STORE instructions 
reference memory; all other instructions operate on 
registers. 
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Figure 3. Instruction Formats 
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Table 1. 80960KA Instruction Set 
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4. Simple Instruction Formats. All instructions in 
the 80960KA are 32 bits long and must be aligned 
on word boundaries. This alignment makes it possi- 
ble to eliminate the instruction-alignment stage in 
the pipeline. To simplify the instruction decoder fur- 
ther, there are only five instruction formats and each 
instruction uses only one format. (See Figure 3.) 

5. Overlapped Instruction Execution. A load oper- 
ation allows execution of subsequent instructions to 
continue before the data has been returned from 
memory, so that these instructions can overlap the 
load. The 80960KA manages this process transpar- 
ently to software through the use of a register score- 
board. Conditional instructions also make use of a 
scoreboard so that subsequent unrelated instruc- 
tions can be executed while the conditional instruc- 
tion is pending. 

6. Integer Execution Optimization. When the re- 
sult of an operation is used as an operand in a sub- 
sequent calculation, the value is sent immediately to 
its destination register. Yet at the same time, the 
value is put back on a bypass path to the ALU, 
thereby saving the time that otherwise would be re- 
quired to retrieve the value for the next operation. 

7. Bandwidth Optimizations. The 80960KA gets 
optimal use of its memory bus bandwidth because 
the bus is tuned for use with the cache: the line size 
of the instruction cache matches the maximum burst 
size for instruction fetches. The 80960KA automati- 
cally fetches four words in a burst and stores them 
directly in the cache. Due to the size of the cache 
and the fact that it is continually filled in anticipation 
of needed instructions in the program flow, the 
80960KA is exceptionally insensitive to memory wait 
states. In fact, each wait state causes only a 7% 
degradation in system performance. The benefit is 
that the 80960KA will deliver outstanding 
performance even with a low cost memory system. 

8. Cache Bypass. If there is a cache miss, the proc- 
essor fetches the needed instruction, then sends it 
on to the instruction decoder at the same time it 
updates the cache. Thus, no extra time is taken to 
load and read the cache. 



Memory Space and Addressing Modes 

The 80960KA offers a linear programming environ- 
ment so that all programs running on the processor 
are contained in a single address space. The maximum 
size of the address space is 4 Gigabytes (2^32 
bytes). 

For ease of use, the 80960KA has a small number of 
addressing modes, but includes all those necessary 
to ensure efficient compiler implementations of 
high-level languages such as C, Fortran and Ada. Table 2 
lists the memory addressing modes. 



Data Types 

The 80960KA recognizes the following data types: 

Numeric: 

• 8-, 16-, 32- and 64-bit ordinals 

• 8-, 16-, 32- and 64-bit integers 

Non-Numeric: 

• Bit 

• Bit Field 

• Triple-Word (96 bits) 

• Quad-Word (128 bits) 
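
The difference between the two numeric types can be illustrated by reading the same bit pattern both ways. The sketch assumes ordinals are unsigned values and integers are two's-complement signed values; the function names are hypothetical.

```python
def as_ordinal(bits, width):
    """Interpret the low `width` bits as an unsigned (ordinal) value."""
    return bits & ((1 << width) - 1)


def as_integer(bits, width):
    """Interpret the low `width` bits as a two's-complement (integer) value."""
    value = as_ordinal(bits, width)
    sign_bit = 1 << (width - 1)
    return value - (1 << width) if value & sign_bit else value


# The all-ones pattern is the largest ordinal and the integer -1 at every width.
for width in (8, 16, 32, 64):
    print(width, as_ordinal(-1, width), as_integer(-1, width))
```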



Large Register Set 

The programming environment of the 80960KA in- 
cludes a large number of registers. In fact, 32 regis- 
ters are available at any time. The availability of this 
many registers greatly reduces the number of mem- 
ory accesses required to execute most programs, 
which leads to greater instruction processing speed. 

There are two types of general-purpose registers: 
local and global. The global registers consist of 
sixteen 32-bit registers (G0 through G15). These 
registers perform the same function as the 
general-purpose registers provided in other popular 
microprocessors. The term global refers to the fact that 
these registers retain their contents across procedure 
calls. 

The local registers, on the other hand, are proce- 
dure specific. For each procedure call, the 80960KA 
allocates 16 local registers (R0 through R15). Each 
local register is 32 bits wide. 



Multiple Register Sets 

To further increase the efficiency of the register set, 
multiple sets of local registers are stored on-chip. 
This cache holds up to four local register frames, 
which means that up to three procedure calls can be 
made without having to access the procedure stack 
resident in memory. 

Although programs may have procedure calls nest- 
ed many calls deep, a program typically oscillates 
back and forth between only two or three levels. As 
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Table 2. Memory Addressing Modes 



• 12-Bit Offset 

• 32-Bit Offset 

• Register-Indirect 

• Register + 12-Bit Offset 

• Register + 32-Bit Offset 

• Register + (Index-Register x Scale-Factor) 

• Register x Scale Factor + 32-Bit Displacement 

• Register + (Index-Register x Scale-Factor) + 32-Bit Displacement 

Scale-Factor is 1, 2, 4, 8 or 16 
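
The modes in Table 2 are all special cases of one sum: base register, plus index register times scale factor, plus an offset or displacement. The sketch below computes that sum. Wrapping the result to 32 bits is an assumption, and the function and parameter names are hypothetical.

```python
MASK32 = 0xFFFFFFFF
VALID_SCALES = (1, 2, 4, 8, 16)


def effective_address(base=0, index=0, scale=1, displacement=0):
    """Compute base + index * scale + displacement, wrapped to 32 bits."""
    if scale not in VALID_SCALES:
        raise ValueError("scale factor must be 1, 2, 4, 8 or 16")
    return (base + index * scale + displacement) & MASK32


# Register + (Index-Register x Scale-Factor) + 32-Bit Displacement
print(hex(effective_address(base=0x2000, index=3, scale=4, displacement=0x10)))
```

Leaving arguments at their defaults gives the simpler modes; for example, passing only `base` corresponds to Register-Indirect.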



a result, with four stack frames in the cache, the 
probability of there being a free frame on the cache 
when a call is made is very high. In fact, runs of 
representative C-language programs show that 80% 
of the calls are handled without needing to access 
memory. 

If there are four or more active procedures and a 
new procedure is called, the processor moves the 
oldest set of local registers in the register cache to a 
procedure stack in memory to make room for a new 
set of registers. Global register G15 is used by the 
processor as the frame pointer (FP) for the 
procedure stack. 

Note that the global registers are not exchanged on 
a procedure call, but retain their contents, making 
them available to all procedures for fast parameter 
passing. An illustration of the register cache is 
shown in Figure 4. 
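
The behavior of the local register cache can be pictured with a small simulation. The Python sketch below assumes a four-frame cache that spills the oldest frame on a call when full, as described above. The class name `LocalRegisterCache`, the counters, and the policy of reloading a spilled frame on return are hypothetical and only serve the illustration.

```python
from collections import deque


class LocalRegisterCache:
    """Toy model of the on-chip cache of local register sets (assumed policy)."""

    FRAMES = 4  # "This cache holds up to four local register frames"

    def __init__(self):
        self.cached = deque([0])  # frame ids resident on chip; 0 = first procedure
        self.spilled = []         # frame ids moved to the procedure stack in memory
        self.depth = 0            # current call depth
        self.spills = 0           # frames written out to memory
        self.fills = 0            # frames read back from memory (assumption)

    def call(self):
        if len(self.cached) == self.FRAMES:
            # Cache full: the oldest set of local registers goes to memory.
            self.spilled.append(self.cached.popleft())
            self.spills += 1
        self.depth += 1
        self.cached.append(self.depth)

    def ret(self):
        if self.depth == 0:
            raise RuntimeError("return without a matching call")
        self.cached.pop()
        self.depth -= 1
        if not self.cached:
            # Caller's frame was spilled earlier; assume it is reloaded here.
            self.cached.append(self.spilled.pop())
            self.fills += 1


cache = LocalRegisterCache()
for _ in range(3):
    cache.call()          # three calls fit without touching memory
spills_after_three = cache.spills
cache.call()              # a fourth call forces one spill
spills_after_four = cache.spills
```

A program that then oscillates between two or three call levels causes no further spills in this model, which is the effect the text describes.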

Figure 4. Multiple Register Sets Are Stored On-Chip 



Instruction Cache 

To further reduce memory accesses, the 80960KA 
includes a 512-byte on-chip instruction cache. The 
instruction cache is based on the concept of locality 
of reference; that is, most programs are not usually 
executed in a steady stream but consist of many 
branches and loops that lead to jumping back and 
forth within the same small section of code. Thus, by 
maintaining a block of instructions in a cache, the 
number of memory references required to read in- 
structions into the processor can be greatly reduced. 

To load the instruction cache, instructions are 
fetched in 16-byte blocks, so that up to four 
instructions can be fetched at one time. An efficient 
prefetch algorithm increases the probability that an 
instruction will already be in the cache when it is 
needed. 

Code for small loops will often fit entirely within the 
cache, leading to a great increase in processing 
speed since further memory references might not be 
necessary until the program exits the loop. Similarly, 
when calling short procedures, the code for the call- 
ing procedure is likely to remain in the cache, so it 
will be there on the procedure's return. 
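
As a rough way to judge whether a loop can stay resident, the cache size and block size quoted above give 512 / 16 = 32 blocks. The Python sketch below counts how many 16-byte blocks a contiguous code range touches. The helper names are hypothetical, and the sketch assumes that a contiguous range of at most 32 blocks can be resident at once, which says nothing about the actual cache organization or prefetch behavior.

```python
CACHE_BYTES = 512   # on-chip instruction cache size
BLOCK_BYTES = 16    # instructions are fetched in 16-byte blocks
CACHE_BLOCKS = CACHE_BYTES // BLOCK_BYTES  # 32 blocks


def blocks_spanned(start_address, length_bytes):
    """Number of 16-byte blocks touched by a contiguous code range."""
    if length_bytes <= 0:
        return 0
    first = start_address // BLOCK_BYTES
    last = (start_address + length_bytes - 1) // BLOCK_BYTES
    return last - first + 1


def loop_fits_in_cache(start_address, length_bytes):
    """True if the range touches no more blocks than the cache holds (assumed model)."""
    return blocks_spanned(start_address, length_bytes) <= CACHE_BLOCKS
```

Under this model a 512-byte loop fits only when it starts on a 16-byte boundary; the same loop starting 8 bytes later touches 33 blocks.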



Register Scoreboarding 

The instruction decoder has been optimized in 
several ways. One of these optimizations is the ability to 
do instruction overlapping by means of register 
scoreboarding. 

Register scoreboarding occurs when a LOAD 
instruction is executed to move a variable from 
memory into a register. When the instruction is initiated, a 
scoreboard bit on the target register is set. When the 
register is actually loaded, the bit is reset. In 
between, any reference to the register contents is 
accompanied by a test of the scoreboard bit to ensure 
that the load has completed before processing 
continues. Since the processor does not have to wait for 
the LOAD to be completed, it can go on to execute 
additional instructions placed in between the LOAD 
instruction and the instruction that uses the register 
contents, as shown in the following example: 

LOAD R4, address 1 
LOAD R5, address 2 
Unrelated instruction 
Unrelated instruction 
ADD R4, R5, R6 

In essence, the two unrelated instructions between 
the LOAD and ADD instructions are executed for 
free (i.e., take no apparent time to execute) because 
they are executed while the register is being loaded. 
Up to three instructions can be pending at one time 
with three corresponding scoreboard bits set. By ex- 
ploiting this feature, system programmers and com- 
pilers have a useful tool for optimizing execution 
speed. 
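
The effect of scoreboarding on the LOAD/ADD example can be checked with a toy timing model. The Python sketch below assumes one instruction issues per cycle and that a load takes a fixed `load_latency` of four cycles. Both numbers, the tuple format of the program, and the function name `run` are illustrative assumptions and do not represent 80960KA timing. The limit of three pending loads is not modeled.

```python
def run(program, load_latency=4):
    """Cycle count for an in-order toy pipeline with a register scoreboard.

    program: list of (opcode, destination_register, source_registers).
    """
    busy_until = {}  # register -> cycle at which its scoreboard bit is reset
    cycle = 0
    for opcode, dest, sources in program:
        # Any reference to a scoreboarded register waits until the bit clears.
        operands = list(sources) + [dest]
        start = max([cycle] + [busy_until.get(reg, 0) for reg in operands])
        if opcode == "LOAD":
            busy_until[dest] = start + load_latency  # set the scoreboard bit
        cycle = start + 1  # assumed: one issue slot per instruction
    return cycle


# ADD R4, R5, R6 is modeled as sources R4 and R5 with destination R6.
back_to_back = [
    ("LOAD", "R4", []),
    ("LOAD", "R5", []),
    ("ADD", "R6", ["R4", "R5"]),
]
interleaved = back_to_back[:2] + [
    ("OP", "R1", ["R2"]),   # unrelated instruction
    ("OP", "R3", ["R2"]),   # unrelated instruction
] + back_to_back[2:]

cycles_back_to_back = run(back_to_back)
cycles_interleaved = run(interleaved)
```

In this model both sequences finish in the same number of cycles, so the two unrelated instructions add no apparent time.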



High Bandwidth Local Bus 

An 80960KA CPU resides on a high-bandwidth ad- 
dress/data bus known as the local bus (L-Bus). The 
L-Bus provides a direct communication path be- 
tween the processor and the memory and I/O sub- 
system interfaces. The processor uses the local bus 
to fetch instructions, manipulate memory, and re- 
spond to interrupts. Its features include: 

• 32-bit multiplexed address/data path 

• Four-word burst capability, which allows transfers 
from 1 to 16 bytes at a time 

• High bandwidth reads and writes at 66.7 Mbytes 
per second 

• Special signal to indicate whether a memory 
transaction can be cached 

Figure 5 identifies the groups of signals which con- 
stitute the L-Bus. Table 4 lists the function of the L- 
Bus and other processor-support signals, such as 
the interrupt lines. 



Interrupt Handling 

The 80960KA can be interrupted in one of two ways: 
by the activation of one of four interrupt pins or by 
sending a message on the processor's data bus. 

The 80960KA is unusual in that it automatically han- 
dles interrupts on a priority basis and tracks pending 
interrupts through its on-chip interrupt controller. 
Two of the interrupt pins can be configured to pro- 
vide 8259A handshaking for expansion beyond four 
interrupt lines. 



Debug Features 

The 80960KA has built-in debug capabilities. There 
are two types of breakpoints and six different trace 
modes. The debug features are controlled by two 
internal 32-bit registers, the Process-Controls Word 
and the Trace-Controls Word. By setting bits in 
these control words, a software debug monitor can 
closely control how the processor responds during 
program execution. 

The 80960KA has both hardware and software 
breakpoints. It provides two hardware breakpoint 
registers on-chip which can be set by a special com- 
mand to any value. When the instruction pointer 
matches the value in one of the breakpoint registers, 
the breakpoint will fire, and a breakpoint handling 
routine is called automatically. 

The 80960KA also provides software breakpoints 
through the use of two instructions, MARK and 
FMARK. These instructions can be placed at any 
point in a program and will cause the processor to 
halt execution at that point and call the breakpoint 
handling routine. The breakpoint mechanism is easy 
to use and provides a powerful debugging tool. 

Tracing is available for instructions (single-step 
execution), calls and returns, and branching. Each 
different type of trace may be enabled separately by a 
special debug instruction. In each case, the 
80960KA executes the instruction first and then 
calls a trace handling routine (usually part of a 
software debug monitor). Further program execution is 
halted until the trace routine is completed. When the 
trace event handling routine is completed, 
instruction execution resumes at the next instruction. The 
80960KA's tracing mechanisms, which are 
implemented completely in hardware, greatly simplify the 
task of testing and debugging software. 



FAULT DETECTION 

The 80960KA has an automatic mechanism to 
handle faults. There are ten fault types including 
trace, arithmetic, and floating-point faults. When the 
processor detects a fault, it automatically calls the 
appropriate fault handling routine and saves the cur- 
rent instruction pointer and necessary state informa- 
tion to make efficient recovery possible. The proces- 
sor posts diagnostic information on the type of fault 
to a Fault Record. Like interrupt handling routines, 
fault handling routines are usually written to meet 
the needs of a specific application and are often in- 
cluded as part of the operating system or kernel. 

For each of the ten fault types, there are numerous 
subtypes that provide specific information about a 
fault. For example, a floating-point fault may have its 
subtype set to an Overflow or Zero-Divide fault. The 
fault handler can use this specific information to re- 
spond correctly to the fault. 



BUILT-IN TESTABILITY 

Upon reset, the 80960KA automatically conducts an 
extensive internal test (self-test) of its major blocks 
of logic. Then, before executing its first instruction, it 
performs a zero checksum on the first eight words in 
memory to ensure that the system has been loaded 
correctly. If a problem is discovered at any point 
during the self-test, the 80960KA will assert its 
FAILURE pin and will not begin program execution. The 
self-test takes approximately 47,000 cycles to 
complete. 

System manufacturers can use the 80960KA's self- 
test feature during incoming parts inspection. No 
special diagnostic programs need to be written, and 
the test is both thorough and fast. The self-test ca- 
pability helps ensure that defective parts will be dis- 
covered before systems are shipped, and once in 
the field, the self-test makes it easier to distinguish 
between problems caused by processor failure and 
problems resulting from other causes. 



CHMOS 

The 80960KA is fabricated using Intel's CHMOS IV 
(Complementary High Speed Metal Oxide Semicon- 
ductor) process. This advanced technology elimi- 
nates the frequency and reliability limitations of older 
CMOS processes and opens a new era in micro- 
processor performance. It combines the high per- 
formance capabilities of Intel's industry-leading 
HMOS technology with the high density and low 
power characteristics of CMOS. The 80960KA is 
available at 10, 16, 20 and 25 MHz. 



Table 4a. 80960KA Pin Description: L-Bus Signals 

Symbol        Type    Name and Function 

CLK2          I       SYSTEM CLOCK provides the fundamental timing for 
                      80960KA systems. It is divided by two inside the 
                      80960KA to generate the internal processor clock. 

LAD31-LAD0    I/O     LOCAL ADDRESS/DATA BUS carries 32-bit physical 
              T.S.    addresses and data to and from memory. During an 
                      address (Ta) cycle, bits 2-31 contain a physical 
                      word address (bits 0-1 indicate SIZE; see below). 
                      During a data (Td) cycle, bits 0-31 contain read 
                      or write data. The LAD lines are active HIGH and 
                      float to a high impedance state when not active. 

                      SIZE, which comprises bits 0-1 of the LAD lines 
                      during a Ta cycle, specifies the size of a burst 
                      transfer in words. 

                      LAD1  LAD0 
                      0     0     1 Word 
                      0     1     2 Words 
                      1     0     3 Words 
                      1     1     4 Words 

ALE           O       ADDRESS-LATCH ENABLE indicates the transfer of a 
              T.S.    physical address. ALE is asserted during a Ta 
                      cycle and deasserted before the beginning of the 
                      Td state. It is active LOW and floats to a high 
                      impedance state during a hold cycle (Th or Thr). 

I/O = Input/Output, O = Output, I = Input, O.D. = Open-Drain, T.S. = tri-state 

Table 4a. 80960KA Pin Description: L-Bus Signals (Continued) 

Symbol        Type    Name and Function 

ADS           O.D.    ADDRESS/DATA STATUS indicates an address state. 
                      ADS is asserted every Ta state and deasserted 
                      during the following Td state. For a burst 
                      transaction, ADS is asserted again every Td state 
                      where READY was asserted in the previous cycle. 

W/R           O.D.    WRITE/READ specifies, during a Ta cycle, whether 
                      the operation is a write or read. It is latched 
                      on-chip and remains valid during Td cycles. 

DT/R          O.D.    DATA TRANSMIT/RECEIVE indicates the direction of 
                      data transfer to and from the L-Bus. It is low 
                      during Ta and Td cycles for a read or interrupt 
                      acknowledgement; it is high during Ta and Td 
                      cycles for a write. DT/R never changes state when 
                      DEN is asserted (see Timing Diagrams). 

DEN           O.D.    DATA ENABLE is asserted during Td cycles and 
                      indicates transfer of data on the LAD bus lines. 

READY         I       READY indicates that data on LAD lines can be 
                      sampled or removed. If READY is not asserted 
                      during a Td cycle, the Td cycle is extended to 
                      the next cycle by inserting a wait state (Tw), 
                      and ADS is not asserted in the next cycle. 

LOCK          I/O     BUS LOCK prevents other bus masters from gaining 
              O.D.    control of the L-Bus following the current cycle 
                      (if they would assert LOCK to do so). LOCK is 
                      used by the processor or any bus agent when it 
                      performs indivisible Read/Modify/Write (RMW) 
                      operations. Do not leave LOCK unconnected. It 
                      must be pulled high for the processor to function 
                      properly. 

                      For a read that is designated as an RMW-read, 
                      LOCK is examined: if asserted, the processor 
                      waits until it is not asserted; if not asserted, 
                      the processor asserts LOCK during the Ta cycle 
                      and leaves it asserted. 

                      A write that is designated as an RMW-write 
                      deasserts LOCK in the Ta cycle. During the time 
                      LOCK is asserted, a bus agent can perform a 
                      normal read or write but no RMW operations. LOCK 
                      is also held asserted during an 
                      interrupt-acknowledge transaction. 

BE3-BE0       O.D.    BYTE ENABLE LINES specify which data bytes (up to 
                      four) on the bus take part in the current bus 
                      cycle. BE3 corresponds to LAD31-LAD24 and BE0 
                      corresponds to LAD7-LAD0. 

                      The byte enables are provided in advance of data. 
                      The byte enables asserted during Ta specify the 
                      bytes of the first data word. The byte enables 
                      asserted during Td specify the bytes of the next 
                      data word (if any), that is, the word to be 
                      transmitted following the next assertion of 
                      READY. The byte enables during the Td cycles 
                      preceding the last assertion of READY are 
                      undefined. The byte enables are latched on-chip 
                      and remain constant from one Td cycle to the next 
                      when READY is not asserted. 

                      For reads, the byte enables specify the byte(s) 
                      that the processor will actually use. L-Bus 
                      agents are required to assert only adjacent byte 
                      enables (e.g., asserting just BE0 and BE2 is not 
                      permitted), and are required to assert at least 
                      one byte enable. To produce address bits A0 and 
                      A1 externally, they can be decoded from the byte 
                      enables. 

I/O = Input/Output, O = Output, I = Input, O.D. = Open-Drain, T.S. = tri-state 

Table 4a. 80960KA Pin Description: L-Bus Signals (Continued) 

Symbol        Type    Name and Function 

HOLD/         I       HOLD: If the processor is the primary bus master 
HLDAR                 (PBM), the input is interpreted as HOLD, a 
                      request from a secondary bus master to acquire 
                      the bus. When the processor receives HOLD and 
                      grants another master control of the bus, it 
                      floats its tri-state bus lines and then asserts 
                      HLDA and enters the Th state. When HOLD is 
                      deasserted, the processor will deassert HLDA and 
                      go to either the Ti or Ta state. 

                      HOLD ACKNOWLEDGE RECEIVED: If the processor is a 
                      secondary bus master (SBM), the input is HLDAR, 
                      which indicates, when HOLDR output is high, that 
                      the processor has acquired the bus. Processors 
                      and other agents can be told at reset if they are 
                      the primary bus master (PBM). 

HLDA/         T.S.    HOLD ACKNOWLEDGE: If the processor is a primary 
HOLDR                 bus master, the output is HLDA, which 
                      relinquishes control of the bus to another bus 
                      master. 

                      HOLD REQUEST: For secondary bus masters (SBM), 
                      the output is HOLDR, which is a request to 
                      acquire the bus. The bus is said to be acquired 
                      if the agent is a primary bus master and does not 
                      have its HLDA output asserted, or if the agent is 
                      a secondary bus master and has its HOLD input and 
                      HLDA output asserted. 

CACHE         T.S.    CACHE indicates if an access is cacheable during 
                      a Ta cycle. It is not asserted during any 
                      synchronous access, such as a synchronous load or 
                      move instruction used for sending an IAC message. 
                      The CACHE signal floats to a high impedance state 
                      when the processor is idle. 
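
The LAD and byte-enable descriptions above imply some simple decoding for external logic. The Python sketch below is a software illustration only: it splits an address-cycle LAD value into the word address (bits 2-31) and the burst SIZE (bits 0-1, encoded as the number of words minus one, as read from the SIZE table), and derives the A1:A0 byte offset from the byte enables. The function names are hypothetical, and byte enables are passed as logical "asserted" flags rather than pin levels.

```python
def decode_address_cycle(lad):
    """Split a 32-bit LAD value sampled during a Ta cycle.

    Returns (byte address of the first word, burst length in words).
    """
    word_address = lad & 0xFFFFFFFC   # bits 2-31: physical word address
    size_words = (lad & 0x3) + 1      # bits 0-1: SIZE, 00 = 1 word ... 11 = 4 words
    return word_address, size_words


def decode_byte_enables(asserted):
    """Derive (A1:A0 byte offset, byte count) from BE0..BE3.

    asserted: sequence of four booleans, index 0 = BE0 (LAD7-LAD0).
    """
    active = [i for i, flag in enumerate(asserted) if flag]
    if not active:
        raise ValueError("at least one byte enable must be asserted")
    if active != list(range(active[0], active[-1] + 1)):
        raise ValueError("only adjacent byte enables may be asserted")
    return active[0], len(active)


address, burst_words = decode_address_cycle(0x00001003)
offset, count = decode_byte_enables((False, True, True, False))
```

The adjacency check mirrors the rule that asserting just BE0 and BE2 is not permitted.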



Table 4b. 80960KA Pin Description: Module Support Signals 

Symbol        Type    Name and Function 

BADAC         I       BAD ACCESS, if asserted in the cycle following 
                      the one in which the last READY of a transaction 
                      is asserted, indicates that an unrecoverable 
                      error has occurred on the current bus 
                      transaction, or that a synchronous load/store 
                      instruction has not been acknowledged. 

                      STARTUP: During system reset, the BADAC signal is 
                      interpreted differently. If the signal is high, 
                      it indicates that this processor will perform 
                      system initialization. If it is low, another 
                      processor in the system will perform system 
                      initialization instead. 

RESET         I       RESET clears the internal logic of the processor 
                      and causes it to re-initialize. 

                      During RESET assertion, the input pins are 
                      ignored (except for BADAC and IAC/INT0), the 
                      tri-state output pins are placed in a high 
                      impedance state, and other output pins are placed 
                      in their non-asserted state. 

                      RESET must be asserted for at least 41 CLK2 
                      cycles for a predictable RESET. The HIGH to LOW 
                      transition of RESET should occur after the rising 
                      edge of both CLK2 and the external bus CLK, and 
                      before the next rising edge of CLK2. 

FAILURE       O.D.    INITIALIZATION FAILURE indicates that the 
                      processor has failed to initialize correctly. 
                      After RESET is deasserted and before the first 
                      bus transaction begins, FAILURE is asserted while 
                      the processor performs a self-test. If the 
                      self-test completes successfully, then FAILURE is 
                      deasserted. Next, the processor performs a zero 
                      checksum on the first eight words of memory. If 
                      it fails, FAILURE is asserted for a second time 
                      and remains asserted; if it passes, system 
                      initialization continues and FAILURE remains 
                      deasserted. 

N.C.          N/A     NOT CONNECTED indicates pins should not be 
                      connected. Never connect any pin marked N.C. 

I/O = Input/Output, O = Output, I = Input, O.D. = Open-Drain, T.S. = tri-state 

Table 4b. 80960KA Pin Description: Module Support Signals (Continued) 

Symbol        Type    Name and Function 

IAC/          I       INTERAGENT COMMUNICATION REQUEST/INTERRUPT 
INT0                  indicates either that there is a pending IAC 
                      message for the processor or an interrupt. The 
                      bus interrupt control register determines in 
                      which way the signal should be interpreted. To 
                      signal an interrupt or IAC request in a 
                      synchronous system, this pin (as well as the 
                      other interrupt pins) must be enabled by being 
                      deasserted for at least one bus cycle and then 
                      asserted for at least one additional bus cycle; 
                      in an asynchronous system, the pin must remain 
                      deasserted for at least two bus cycles and then 
                      be asserted for at least two more bus cycles. 

                      LOCAL PROCESSOR NUMBER: This signal is 
                      interpreted differently during system reset. If 
                      the signal is at a high voltage level, it 
                      indicates that this processor is a primary bus 
                      master (Local Processor Number = 0); if it is at 
                      a low voltage level, it indicates that this 
                      processor is a secondary bus master (Local 
                      Processor Number = 1). 

INT1          I       INTERRUPT 1, like INT0, provides direct interrupt 
                      signaling. 

INT2/         I       INTERRUPT 2/INTERRUPT REQUEST: The bus control 
INTR                  register determines how this pin is interpreted. 
                      If INT2, it has the same interpretation as the 
                      INT0 and INT1 pins. If INTR, it is used to 
                      receive an interrupt request from an external 
                      interrupt controller. 

INT3/         I/O     INTERRUPT 3/INTERRUPT ACKNOWLEDGE: The bus 
INTA          O.D.    interrupt control register determines how this 
                      pin is interpreted. If INT3, it has the same 
                      interpretation as the INT0, INT1, and INT2 pins. 
                      If INTA, it is used as an output to control 
                      interrupt-acknowledge bus transactions. The INTA 
                      output is latched on-chip and remains valid 
                      during Td cycles; as an output, it is open-drain. 

I/O = Input/Output, O = Output, I = Input, O.D. = Open-Drain, T.S. = tri-state 



ELECTRICAL SPECIFICATIONS 



Power and Grounding 

The 80960KA is implemented in CHMOS IV 
technology and has modest power requirements. Its high 
clock frequency and numerous output buffers 
(address/data, control, error, and arbitration signals) 
can cause power surges as multiple output buffers 
drive new signal levels simultaneously. For clean 
on-chip power distribution at high frequency, 12 VCC 
and 13 VSS pins separately feed functional units of 
the 80960KA in the PGA. 

Power and ground connections must be made to all 
power and ground pins of the 80960KA. On the 
circuit board, all VCC pins must be strapped closely 
together, preferably on a power plane. Likewise, all 
VSS pins should be strapped together, preferably on 
a ground plane. These pins may not be connected 
together within the chip. 



Power Decoupling Recommendations 

Liberal decoupling capacitance should be placed 
near the 80960KA. The processor can cause tran- 
sient power surges when driving the L-Bus, particu- 
larly when it is connected to a large capacitive load. 

Low inductance capacitors and interconnects are 
recommended for best high frequency electrical per- 
formance. Inductance can be reduced by shortening 
the board traces between the processor and de- 
coupling capacitors as much as possible. Capacitors 
specifically designed for PGA packages are also 
commercially available and offer the lowest possible 
inductance. 



Connection Recommendations 

For reliable operation, always connect unused in- 
puts to an appropriate signal level. In particular, if 
one or more interrupt lines are not used, they should 
be pulled up. No inputs should ever be left floating. 

All open-drain outputs require a pullup device. While 
in some cases a simple pullup resistor will be 
adequate, we recommend a network of pullup and 
pulldown resistors biased to a valid VIH (>3.4V) and 
terminated in the characteristic impedance of the 
circuit board. Figure 6 shows our recommendations for 
the resistor values for both a low and high current 
drive network, which assumes that the circuit board 
has a characteristic impedance of 100Ω. The 
advantage of terminating the output signals in this fashion 
is that it limits signal swing and reduces AC power 
consumption. 



Characteristic Curves 

Figure 7 shows the typical supply current 
requirements over the operating temperature range of the 
processor at supply voltage (VCC) of 5V. Figure 8 
shows the typical power supply current (ICC) 
required by the 80960KA at various operating 
frequencies when measured at three input voltage (VCC) 
levels. 

For a given output current (IOL), the curve in Figure 9 
shows the worst case output low voltage (VOL). 

Figure 10 shows the typical capacitive derating 
curve for the 80960KA measured from 1.5V on the 
system clock (CLK) to 1.5V on the falling edge and 
1.5V on the rising edge of the L-Bus address/data 
(LAD) signals. 



Test Load Circuit 

Figure 13 illustrates the load circuit used to test the 
80960KA's tri-state pins, and Figure 14 shows the 
load circuit used to test the open-drain outputs. The 
open-drain test uses an active load circuit in the form 
of a matched diode bridge. Since the open-drain 
outputs sink current, only the IOL legs of the bridge are 
necessary and the IOH legs are not used. When the 
80960KA driver under test is turned off, the output 
pin is pulled up to VREF (i.e., VOH). Diode D1 is 
turned off and the IOL current source flows through 
diode D2. 

When the 80960KA open-drain driver under test is 
on, diode D1 is also on, and the voltage on the pin 
being tested drops to VOL. Diode D2 turns off and 
IOL flows through diode D1. 

[Figure 6 schematic: an open-drain output connected to a resistor 
network referenced to VCC; the legible resistor labels are 180Ω and 280Ω.] 

Low Drive Network: 

• VOH = 3.42V 

• IOL = 25.3 mA 

High Drive Network: 

• VOH = 3.41V 

• IOL = 33.8 mA 

Figure 6. Connection Recommendations for Low and High Current Drive Networks 

Figure 7. Typical Supply Current (ICC) 

Figure 8. Typical Current vs Frequency 

Figure 9. Worst Case Voltage vs Output Current on Open-Drain Pins 

Figure 10. Capacitive Derating Curve 



ABSOLUTE MAXIMUM RATINGS* 

Operating Temperature    0°C to +85°C Case 

Storage Temperature      -65°C to +150°C 

Voltage on Any Pin       -0.5V to VCC + 0.5V 

Power Dissipation        2.5W (25 MHz) 



NOTICE: This is a production data sheet. The specifi- 
cations are subject to change without notice. 



* WARNING: Stressing the device beyond the "Absolute 
Maximum Ratings" may cause permanent damage. 
These are stress ratings only. Operation beyond the 
"Operating Conditions" is not recommended and ex- 
tended exposure beyond the "Operating Conditions" 
may affect device reliability. 



DC CHARACTERISTICS 

PGA: 
80960KA (16 MHz): TCASE = 0°C to +85°C, VCC = 5V ±10% 
80960KA (20 and 25 MHz): TCASE = 0°C to +85°C, VCC = 5V ±5% 

PQFP: 
80960KA (10 and 16 MHz): TCASE = 0°C to +100°C, VCC = 5V ±10% 
80960KA (20 MHz): TCASE = 0°C to +100°C, VCC = 5V ±5% 

Symbol  Parameter                     Min         Max         Units  Test Conditions 

VIL     Input Low Voltage             -0.3        +0.8        V 
VIH     Input High Voltage            2.0         VCC + 0.3   V 
VCL     CLK2 Input Low Voltage        -0.3        +0.8        V 
VCH     CLK2 Input High Voltage       0.55 VCC    VCC + 0.3   V 
VOL     Output Low Voltage                        0.45        V      (1,5) 
VOH     Output High Voltage           2.4                     V      (2,4) 
ICC     Power Supply Current: 
          10 MHz                                  300         mA 
          16 MHz                                  375         mA 
          20 MHz                                  420         mA 
          25 MHz                                  480         mA 
ILI     Input Leakage Current                     ±15         µA     0 ≤ VIN ≤ VCC 
ILO     Output Leakage Current                    ±15         µA     0.45 ≤ VO ≤ VCC 
CIN     Input Capacitance                         10          pF     fC = 1 MHz(3) 
CO      I/O or Output Capacitance                 12          pF     fC = 1 MHz(3) 
CCLK    Clock Capacitance                         10          pF     fC = 1 MHz(3) 

NOTES: 

1. For tri-state outputs, this parameter is measured at: 
     Address/Data    4.0 mA 
     Controls        5.0 mA 

2. This parameter is measured at: 
     Address/Data    -1.0 mA 
     Controls        -0.9 mA 
     ALE             -5.0 mA 

3. Input, output, and clock capacitance are not tested. 

4. Not measured on open-drain outputs. 

5. For open-drain outputs    25 mA 


AC SPECIFICATIONS 

This section describes the AC specifications for the 
80960KA pins. All input and output timings are 
specified relative to the 1.5V level of the rising edge. For 
output timings, the specifications refer to the time it 
takes the signal to reach 1.5V. For input timings, 
the specifications refer to the time at which the 
signal reaches (for input setup) or leaves (for hold time) 
the TTL levels of LOW (0.8V) or HIGH (2.0V). All AC 
testing should be done with input voltages of 
0.4V and 2.4V, except for the clock (CLK2), which 
should be tested with input voltages of 0.45 VCC and 
0.55 VCC. 



[Figure 11 signal groups] 

OUTPUTS: LAD31-LAD0, ADS, W/R, DEN, BE3-BE0, HLDA/HOLDR, CACHE, LOCK, INTA, DT/R 

INPUTS: LAD31-LAD0, BADAC, IAC/INT0, INT1, INT2/INTR, INT3, HOLD, HLDAR, LOCK, READY 

Figure 11. Drive Levels and Timing Relationships for 80960KA Signals 

Figure 12. Timing Relationship of L-Bus Signals 



AC Specification Tables 

80960KA AC Characteristics (10 MHz, PQFP Only) 

Symbol  Parameter                         Min     Max     Units  Test Conditions 

T1      Processor Clock Period (CLK2)     50      125     ns     VIN = 1.5V 
T2      Processor Clock Low Time (CLK2)   12              ns     VIL = 10% Point = 1.2V 
T3      Processor Clock High Time (CLK2)  12              ns     VIH = 90% Point = 0.1V + 0.5 VCC 
T4      Processor Clock Fall Time (CLK2)          10      ns     VIN = 90% Point to 10% Point 
T5      Processor Clock Rise Time (CLK2)          10      ns     VIN = 10% Point to 90% Point 
T6      Output Valid Delay                2       25      ns     CL = 100 pF (LAD) 
                                                                 CL = 75 pF (Controls)(2) 
T6H     HOLDA Output Valid Delay          4       31      ns     CL = 75 pF 
T7      ALE Width                         25              ns     CL = 75 pF 
T8      ALE Output Valid Delay                    20      ns     CL = 75 pF(2) 
T9      Output Float Delay                2       20      ns     CL = 100 pF (LAD) 
                                                                 CL = 75 pF (Controls) 
T9H     HOLDA Output Float Delay          4       20      ns     CL = 75 pF 
T10     Input Setup 1                     3               ns 
T11     Input Hold                        5               ns 
T11H    HOLD Input Hold                   4               ns 
T12     Input Setup 2                     8               ns 
T13     Setup to ALE Inactive             10              ns     CL = 100 pF (LAD) 
                                                                 CL = 75 pF (Controls) 
T14     Hold after ALE Inactive           8               ns     CL = 100 pF (LAD) 
                                                                 CL = 75 pF (Controls) 
T15     Reset Hold                        3               ns 
T16     Reset Setup                       5               ns 
T17     Reset Width                       1640            ns     41 CLK2 Periods Minimum 

NOTES: 

1. IAC/INT0, INT1, INT2/INTR, INT3 can be asynchronous. 

2. A float condition occurs when the maximum output current becomes less than ILO. Float delay is not tested, but should be 
no longer than the valid delay. 

3. Clock rise and fall time is not tested. 

80960KA AC Characteristics (16 MHz) 

Symbol  Parameter                         Min     Max     Units  Test Conditions 

T1      Processor Clock Period (CLK2)     31.25   125     ns     VIN = 1.5V 
T2      Processor Clock Low Time (CLK2)   8               ns     VIL = 10% Point = 1.2V 
T3      Processor Clock High Time (CLK2)  8               ns     VIH = 90% Point = 0.1V + 0.5 VCC 
T4      Processor Clock Fall Time (CLK2)          10      ns     VIN = 90% Point to 10% Point 
T5      Processor Clock Rise Time (CLK2)          10      ns     VIN = 10% Point to 90% Point 
T6      Output Valid Delay                2       25      ns     CL = 100 pF (LAD) 
                                                                 CL = 75 pF (Controls) 
T6H     HOLDA Output Valid Delay          4       31      ns     CL = 75 pF 
T7      ALE Width                         15              ns     CL = 75 pF 
T8      ALE Output Valid Delay                    20      ns     CL = 75 pF(2) 
T9      Output Float Delay                2       20      ns     CL = 100 pF (LAD) 
                                                                 CL = 75 pF (Controls)(2) 
T9H     HOLDA Output Float Delay          4       20      ns     CL = 75 pF 
T10     Input Setup 1                     3               ns 
T11     Input Hold                        5               ns 
T11H    HOLD Input Hold                   4               ns 
T12     Input Setup 2                     8               ns 
T13     Setup to ALE Inactive             10              ns     CL = 100 pF (LAD) 
                                                                 CL = 75 pF (Controls) 
T14     Hold after ALE Inactive           8               ns     CL = 100 pF (LAD) 
                                                                 CL = 75 pF (Controls) 
T15     Reset Hold                        3               ns 
T16     Reset Setup                       5               ns 
T17     Reset Width                       1281            ns     41 CLK2 Periods Minimum 

NOTES: 

1. IAC/INT0, INT1, INT2/INTR, INT3 can be asynchronous. 

2. A float condition occurs when the maximum output current becomes less than ILO. Float delay is not tested, but should be 
no longer than the valid delay. 

3. Clock rise and fall time is not tested. 

80960KA AC Characteristics (20 MHz) 

Symbol  Parameter                         Min     Max     Units  Test Conditions 

T1      Processor Clock Period (CLK2)     25      125     ns     VIN = 1.5V 
T2      Processor Clock Low Time (CLK2)   6               ns     VIL = 10% Point = 1.2V 
T3      Processor Clock High Time (CLK2)  6               ns     VIH = 90% Point = 0.1V + 0.5 VCC 
T4      Processor Clock Fall Time (CLK2)          10      ns     VIN = 90% Point to 10% Point 
T5      Processor Clock Rise Time (CLK2)          10      ns     VIN = 10% Point to 90% Point 
T6      Output Valid Delay                2       20      ns     CL = 60 pF (LAD) 
                                                                 CL = 50 pF (Controls) 
T6H     HOLDA Output Valid Delay          4       26      ns     CL = 50 pF 
T7      ALE Width                         12              ns     CL = 50 pF 
T8      ALE Output Valid Delay                    20      ns     CL = 50 pF(2) 
T9      Output Float Delay                2       20      ns     CL = 60 pF (LAD) 
                                                                 CL = 50 pF (Controls)(2) 
T9H     HOLDA Output Float Delay          4       20      ns     CL = 50 pF 
T10     Input Setup 1                     3               ns 
T11     Input Hold                        5               ns 
T11H    HOLD Input Hold                   4               ns 
T12     Input Setup 2                     7               ns 
T13     Setup to ALE Inactive             10              ns     CL = 60 pF (LAD) 
                                                                 CL = 50 pF (Controls) 
T14     Hold after ALE Inactive           8               ns     CL = 60 pF (LAD) 
                                                                 CL = 50 pF (Controls) 
T15     Reset Hold                        3               ns 
T16     Reset Setup                       5               ns 
T17     Reset Width                       1025            ns     41 CLK2 Periods Minimum 

NOTES: 

1. IAC/INT0, INT1, INT2/INTR, INT3 can be asynchronous. 

2. A float condition occurs when the maximum output current becomes less than ILO. Float delay is not tested, but should be 
no longer than the valid delay. 

3. Clock rise and fall time is not tested. 



80960KA 
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80960KA 
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Iol Tested at 25 mA 

Vref = Vcc 

D1 and D2 are matched 



Figure 13. Test Load Circuit for 
Tri-State Output Pins 



Figure 14. Test Load Circuit for Open-Drain Output Pins 
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80960KA AC Characteristics (25 MHz, PGA Only) 


Symbol  Parameter                         Min   Max   Units  Test Conditions
T1      Processor Clock Period (CLK2)     20    125   ns     VIN = 1.5V
T2      Processor Clock Low Time (CLK2)   5     -     ns     VIL = 10% Point = 1.2V
T3      Processor Clock High Time         5     -     ns     VIH = 90% Point = 0.1V + 0.5 Vcc
T4      Processor Clock Fall Time (CLK2)  -     10    ns     VIN = 90% Point to 10% Point
T5      Processor Clock Rise Time (CLK2)  -     10    ns     VIN = 10% Point to 90% Point
T6      Output Valid Delay                2     18    ns     CL = 60 pF (LAD); CL = 50 pF (Controls)
T6H     HOLDA Output Valid Delay          4     24    ns     CL = 50 pF
T7      ALE Width                         12    -     ns     CL = 50 pF
T8      ALE Output Valid Delay            -     20    ns     CL = 50 pF (2)
T9      Output Float Delay                2     18    ns     CL = 60 pF (LAD); CL = 50 pF (Controls)
T9H     HOLDA Output Float Delay          4     20    ns     CL = 50 pF
T10     Input Setup 1                     3     -     ns
T11     Input Hold                        5     -     ns
T11H    HOLD Input Hold                   4     -     ns
T12     Input Setup 2                     7     -     ns
T13     Setup to ALE Inactive             8     -     ns     CL = 60 pF (LAD); CL = 50 pF (Controls)
T14     Hold after ALE Inactive           8     -     ns     CL = 60 pF (LAD); CL = 50 pF (Controls)
T15     Reset Hold                        3     -     ns
T16     Reset Setup                       5     -     ns
T17     Reset Width                       820   -     ns     41 CLK2 Periods Minimum



NOTES: 

1. IAC/INT0, INT1, INT2/INTR, INT3 can be asynchronous. 

2. A float condition occurs when the maximum output current becomes less than ILO. Float delay is not tested, but should be 
no longer than the valid delay. 

3. Clock rise and fall time is not tested. 
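
The Reset Width (T17) entries in the two AC tables follow from their test condition of 41 CLK2 periods and the minimum CLK2 period (T1) of each part. The short Python sketch below reproduces that arithmetic; the function name is illustrative and does not come from the data sheet.

```python
def min_reset_width_ns(clk2_period_ns, periods=41):
    """Minimum RESET width in ns, taken as 41 CLK2 periods (T17 test condition)."""
    return periods * clk2_period_ns

# Minimum CLK2 period (T1) from the 20 MHz and 25 MHz AC tables, in ns.
for label, t1_ns in (("20 MHz", 25), ("25 MHz", 20)):
    print(label, min_reset_width_ns(t1_ns), "ns")
```

Running the clock slower than the minimum period lengthens the required RESET pulse in proportion.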
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Figure 15. Processor Clock Pulse (CLK2) 
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Figure 16. RESET Signal Timing 
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Figure 17. Hold Timing 



Design Considerations 



Input hold times can be disregarded by the designer 
whenever the input is removed because a subse- 
quent output from the processor is deasserted (e.g., 
DEN becomes deasserted). 

Whenever the processor generates an output that 
indicates a transition into a subsequent state, any 
outputs that are specified to be tri-stated in this new 
state are guaranteed to be tri-stated. For example, in 
the Td cycle following a Ta cycle for a read, the 
minimum output delay of DEN is 2 ns, but the maximum 
float time of LAD is 20 ns. When DEN is asserted, 
however, the LAD outputs are guaranteed to have 
been tri-stated. 



Designing for the ICE-960KB 

The 80960KB In-Circuit Emulator assists in debug- 
ging both 80960KA and 80960KB hardware and 
software designs. The product consists of a probe 
module, cable, and control unit. Because of the high 
operating frequency of 80960KA systems, the probe 
module connects directly to the 80960KA socket. 



When designing an 80960KA hardware system that 
uses the ICE-960KB to debug the system, several 
electrical and mechanical characteristics should be 
considered. These considerations include capacitive 
loading, drive requirement, power requirement and 
physical layout. 

The ICE-960KB probe module increases the load 
capacitance of each line by up to 25 pF. It also adds 
one standard Schottky TTL load on the CLK2 line, 
up to one advanced low-power Schottky TTL load 
for each control signal line, and one advanced low- 
power Schottky TTL load for each address/data and 
byte enable line. These loads originate from the 
probe module and are driven by the 80960KA proc- 
essor. 

To achieve high noise immunity, the ICE-960KB 
probe is powered by the user's system. The 
high-speed probe circuitry draws up to 1.1 A plus the 
maximum current (Icc) of the 80960KA processor. 
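
The probe figures above can be folded into a quick supply and loading budget. The following is a minimal sketch, not a design rule: the 420 mA Icc and the 35 pF board capacitance are example inputs chosen for illustration, and the function names are hypothetical.

```python
PROBE_MAX_CURRENT_A = 1.1   # worst-case draw of the ICE-960KB probe circuitry
PROBE_ADDED_CAP_PF = 25.0   # worst-case capacitance the probe adds to each line

def supply_current_with_probe_a(icc_max_a):
    """Current the 80960KA socket supply must deliver with the probe installed."""
    return PROBE_MAX_CURRENT_A + icc_max_a

def line_capacitance_with_probe_pf(board_cap_pf):
    """Load capacitance on one signal line once the probe is attached."""
    return board_cap_pf + PROBE_ADDED_CAP_PF

# Example inputs (assumptions): Icc = 0.420 A, 35 pF of board capacitance per line.
print(f"supply current: {supply_current_with_probe_a(0.420):.2f} A")
print(f"line capacitance: {line_capacitance_with_probe_pf(35.0):.0f} pF")
```

Comparing the resulting line capacitance with the CL values used as test conditions in the AC tables gives a first indication of whether the specified delays still apply with the probe attached.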

The mechanical considerations are shown in Figure 
18, which illustrates the lateral clearance require- 
ments for the ICE-960KB probe as viewed from 
above the socket of the 80960KA processor. 
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Figure 18. ICE-960KB Lateral Clearance Requirements 



MECHANICAL DATA 



Package Dimensions and Mounting 

The 80960KA is available in two different packages: 
a 132-lead ceramic pin-grid array (PGA) and a 132- 
lead plastic quad flat pack (PQFP). Pins in the ce- 
ramic package are arranged 0.100 inch (2.54 mm) 
center-to-center, in a 14 by 14 matrix, three rows 
around. (See Figure 19.) The plastic package uses 
fine-pitch gull wing leads arranged in a single row 
along the perimeter of the package with 0.025 inch 
(0.64 mm) spacing. (See Figure 20.) Dimensions are 
given in Figure 21 and Table 7. 

There is a wide variety of sockets available for the 
ceramic PGA package, including low-insertion or 
zero-insertion force mountings, and a choice of 
terminals such as soldertail, surface mount, or wire 
wrap. Several applicable sockets are shown in 
Figure 22. 

The PQFP is normally surface mounted to take best 
advantage of the plastic package's small footprint 
and low cost. In some applications, however, 
designers may prefer to use a socket, either to improve 
heat dissipation or reduce repair costs. Figures 23a 
and 23b show two of the many sockets available. 



Pin Assignment 

The PGA and PQFP have different pin assignments. 
Figure 24 shows the view from the bottom of the 
PGA (pins facing up) and Figure 25 shows a view 
from the top of the PGA (pins facing down). Figures 
20 and 32 show the top view of the PQFP; notice 
that the pins are numbered in order from 1 to 132 
around the package's perimeter. Tables 5 and 6 list 
the function of each pin in the PGA, and Tables 8 
and 9 list the function of each pin in the PQFP. 

Vcc and GND connections must be made to multi- 
ple Vcc and GND pins. Each Vcc and GND pin must 
be connected to the appropriate voltage or ground 
and externally strapped close to the package. We 
recommend that you include separate power and 
ground planes in your circuit board for power distri- 
bution. 

NOTE: 

Pins identified as N.C., "No Connect," should never 
be connected. 
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Package Thermal Specification 

The 80960KA is specified for operation when case 
temperature is within the range 0°C to +85°C (PGA) 
or +100°C (PQFP). The case temperature should 
be measured at the top center of the package as 
shown in Figure 26. 

The ambient temperature can be calculated from θJC 
and θJA by using the following equations: 

TJ = TC + P × θJC 
TA = TJ - P × θJA 
TC = TA + P × [θJA - θJC] 

Values for θJA and θJC are given in Table 10 for the 
PGA package and in Table 11 for the PQFP for 
various airflows. Note that the θJA for the PGA package 
can be reduced by adding a heatsink, while a 
heatsink is not generally used with the plastic package 
since it is intended to be surface mounted. The 
maximum allowable ambient temperature (TA) permitted 
without exceeding TC is shown by the charts in 
Figures 27 through 30 for 10 MHz, 16 MHz, 20 MHz, 
and 25 MHz respectively. 

The curves assume the maximum permitted supply 
current (Icc) at each speed, Vcc of 5.0V, and a 
TCASE of +85°C (PGA) or +100°C (PQFP). 

If you will be using the 80960KA in a harsh 
environment where the ambient temperature may exceed 
the limits for the normal commercial part, you should 
consider using an extended temperature part. These 
parts are designated by the prefix "TA" and are 
available at 16 MHz, 20 MHz and 25 MHz in the ceramic 
PGA package. The extended operating temperature 
range is -40°C to +125°C case. Figure 31 shows 
the maximum allowable ambient temperature for the 
20 MHz extended temperature TA80960KA at 
various airflows. The curve assumes an Icc of 420 mA, 
Vcc of 5.0V, and a TCASE of +125°C. 
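
The thermal equations can be worked through numerically. The sketch below is an illustration under stated assumptions rather than a replacement for the published curves: it takes the extended temperature operating point quoted above (Icc of 420 mA, Vcc of 5.0V, TCASE of +125°C), uses the PGA thermal resistances without a heatsink from Table 10, and estimates the highest ambient temperature that keeps the case at its limit. All function and variable names are hypothetical.

```python
# Case-to-ambient thermal resistance (deg C/W) for the PGA package with no
# heatsink, keyed by airflow in ft/min (values from Table 10).
THETA_CA_PGA_NO_HEATSINK = {0: 19, 50: 18, 100: 17, 200: 15, 400: 12, 600: 10, 800: 9}
THETA_JC_PGA = 2  # junction-to-case, deg C/W (Table 10)

def power_w(icc_a, vcc_v):
    """Dissipated power, taken here simply as Icc times Vcc."""
    return icc_a * vcc_v

def max_ambient_c(t_case_max_c, p_w, theta_ca):
    """TA = TC - P * (theta_JA - theta_JC), which equals TC - P * theta_CA."""
    return t_case_max_c - p_w * theta_ca

def junction_c(t_case_c, p_w, theta_jc=THETA_JC_PGA):
    """TJ = TC + P * theta_JC."""
    return t_case_c + p_w * theta_jc

p = power_w(0.420, 5.0)  # about 2.1 W
for airflow, theta_ca in THETA_CA_PGA_NO_HEATSINK.items():
    ta = max_ambient_c(125.0, p, theta_ca)
    print(f"{airflow:4d} ft/min: TA max = {ta:.1f} C")
```

With no airflow this gives roughly 85°C, and the limit rises as airflow reduces the case-to-ambient resistance. Substituting the heatsink rows of Table 10 shows the additional margin a heatsink provides.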



WAVEFORMS 

Figures 33 through 38 show the waveforms for vari- 
ous transactions on the 80960KA's local bus. 



SUPPORT COMPONENTS 



85C960 Burst Bus Controller 

The Intel 85C960 performs burst logic, ready 
generation, and address decode for the 80960KA and 
80960KB. The burst logic supports both standard 
and burst mode memories and peripherals. The 
ready generation and timing control supports up to 15 
wait states across eight address ranges for read/write 
and burst accesses. The address decoder 
decodes eight address inputs into four external and 
four internal chip selects. The wait state and chip 
select values may be programmed by the user; the 
timing control and burst logic are fixed. 

The 85C960 operates with the 80960KA and 
80960KB at all frequencies and consumes only 
50 mA at 25 MHz. The 85C960 is available in 28-pin, 
300-mil ceramic DIP and plastic DIP packages and in a 
28-pin PLCC package for surface mount. In the ceramic 
DIP package the part is UV-erasable, which makes it 
easy to revise designs. Order the 85C960 data sheet 
(No. 290192) for full details. 



27960KX Burst Mode EPROM 

The Intel 27960KX one-megabit EPROM is designed 
specifically to support the 80960KA and 80960KB. It 
uses a burst interface to offer near zero wait-state 
performance without the high cost of alternative 
memory technologies. The 27960KX removes the 
need for "dumping" code and data stored in slow 
EPROMs or ROMs into expensive high-speed 
"shadow" RAM. 

Internally, the 27960KX is organized in blocks of four 
bytes that are accessed sequentially. The address 
of the four-byte block is latched and incremented 
internally. After a set number of wait-states (1 or 2), 
data is output one word at a time each subsequent 
clock cycle. High-performance outputs provide zero 
wait-state data-to-data burst accesses. Extra power 
and ground pins dedicated to the output reduce the 
effect of fast output switching on the device. The 
27960KX offers 1-0-0-0 performance at 20 MHz and 
2-0-0-0 performance at 25 MHz. Full details can be 
found in the 27960KX data sheet (No. 290237). 
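
The N-0-0-0 notation gives the wait states inserted before each word of a four-word burst. The sketch below estimates what such a pattern costs in bus time. It rests on assumptions that this section does not state: one address cycle, one data cycle per word, one recovery cycle after the burst, and a processor clock that runs at the rated frequency. The function names are hypothetical.

```python
def burst_cycles(wait_states, address_cycles=1, recovery_cycles=1):
    """Processor clock cycles for one burst.

    wait_states is a tuple such as (1, 0, 0, 0), one entry per word.
    The address and recovery cycle counts are assumptions of this sketch.
    """
    words = len(wait_states)
    return address_cycles + sum(wait_states) + words + recovery_cycles

def burst_bandwidth_mbytes_per_s(wait_states, clock_mhz, bytes_per_word=4):
    """Bytes moved per burst divided by the burst duration, in Mbytes/s."""
    total_bytes = bytes_per_word * len(wait_states)
    return total_bytes * clock_mhz / burst_cycles(wait_states)

for pattern, mhz in (((1, 0, 0, 0), 20), ((2, 0, 0, 0), 25), ((0, 0, 0, 0), 25)):
    rate = burst_bandwidth_mbytes_per_s(pattern, mhz)
    print(pattern, f"at {mhz} MHz: {burst_cycles(pattern)} cycles, {rate:.1f} Mbytes/s")
```

Under these assumptions a zero wait-state burst of 16 bytes takes six cycles, which at 25 MHz works out to about 66.7 Mbytes/s.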
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Figure 19. A 132-Lead Pin-Grid Array (PGA) Used to Package the 80960KA 
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Figure 20. The 132-Lead Plastic Quad Flat Pack (PQFP) used to Package the 80960KA 
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Figure 21a. Principal Dimensions of the 132-Lead PQFP 
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Figure 21b. Details of the Molding of the 132-Lead PQFP 
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Figure 21c. Terminal Details for the 132-Lead PQFP 
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Figure 21d. Board Footprint Area for the 132-Lead PQFP 
Table 7. Package Dimension: 80960KA PQFP 



                                              Inches              MM
Symbol  Description                           Min      Max       Min      Max
N       Leadcount                             132 Leads          132 Leads
A       Package Height                        0.160    0.170     4.060    4.320
A1      Standoff                              0.020    0.030     0.510    0.760
D, E    Terminal Dimension                    1.075    1.085     27.310   27.560
D1, E1  Package Body                          0.947    0.953     24.050   24.210
D2, E2  Bumper Distance Without Flash         1.097    1.103     27.860   28.010
        Bumper Distance With Flash            1.097    1.110     27.860   28.190
D3, E3  Lead Dimension                        0.800 REF          20.32 REF
D4, E4  Foot Radius Location                  1.023    1.037     25.890   26.330
L1      Foot Length                           0.020    0.030     0.510    0.760
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• Low insertion force (LIF) soldertail 
55274-1 

• Amp tests indicate 50% reduction in 
insertion force compared to 
machined sockets 

Other socket options 

• Zero insertion force (ZIF) soldertail 
55583-1 

• Zero insertion force (ZIF) Burn-in 
version 55573-2 

Amp Incorporated 
(Harrisburg, PA 17105 U.S.A. 
Phone 717-564-0100) 





Cam handle locks in low profile position when 80960KA is installed 
(handle UP for open and DOWN for closed positions). 

Courtesy Amp Incorporated 



Peel-A-Way* Mylar and Kapton 
Socket Terminal Carriers 

• Low insertion force surface 
mount CS132-37TG 

• Low insertion force soldertail 
CS132-01TG 

• Low insertion force wire-wrap 
CS132-02TG (two-level) 
CS132-03TG (three-level) 

• Low insertion force press-fit 
CS132-05TG 

Advanced Interconnections 
(5 Division Street 
Warwick, RI 02818 U.S.A. 
Phone 401-885-0485) 



Peel-A-Way Carrier No. 132: 
Kapton Carrier is KS132 
Mylar Carrier is MS132 

Molded Plastic Body KS132 
is shown below: 



FOOT PRINT NO. 132 
.100 TYP, 14 x 14 x 3 ROWS 

Terminal styles: SOLDER TAIL -01, WIRE WRAP -02/-03, SOLDER TAIL -33, 
SURFACE MOUNTING -37, LOW PROFILE -04, PRESS FIT -05 
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Courtesy Advanced Interconnections 

(Peel-A-Way Terminal Carriers 

U.S. Patent No. 4442938) 



* Peel-A-Way is a trademark of Advanced Interconnections. 



Figure 22. Several Socket Options for Mounting the 80960KA 
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Figure 23a. AMP Micropitch Socket for the 132-Lead Plastic 
Quad Flat Pack, 0.025" Lead Spacing, Gull Wing Leads 
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Part Number: 

2-01 32-07244-000-01 807 




Figure 23b. 3M Company PQFP Socket and Lid 
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Figure 24. 80960KA PGA Pinout— View from Bottom (Pins Facing Up) 
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Figure 25. 80960KA PGA Pinout— View from Top (Pins Facing Down) 
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Table 5. 80960KA PGA Pinout— In Pin Order 


Pin   Signal        Pin   Signal        Pin   Signal        Pin   Signal
A1    Vcc           C6    LAD20         H1    W/R           M10   Vss
A2    Vss           C7    LAD13         H2    BE0           M11   Vcc
A3    LAD19         C8    LAD8          H3    LOCK          M12   N.C.
A4    LAD17         C9    LAD3          H12   N.C.          M13   N.C.
A5    LAD16         C10   Vcc           H13   N.C.          M14   N.C.
A6    LAD14         C11   Vss           H14   N.C.          N1    Vss
A7    LAD11         C12   INT3/INTA     J1    DT/R          N2    N.C.
A8    LAD9          C13   INT1          J2    BE2           N3    N.C.
A9    LAD7          C14   IAC/INT0      J3    Vss           N4    N.C.
A10   LAD5          D1    ALE           J12   N.C.          N5    N.C.
A11   LAD4          D2    ADS           J13   N.C.          N6    N.C.
A12   LAD1          D3    HLDA/HOLDR    J14   N.C.          N7    N.C.
A13   INT2/INTR     D12   Vcc           K1    BE3           N8    N.C.
A14   Vcc           D13   N.C.          K2    FAILURE       N9    N.C.
B1    LAD23         D14   N.C.          K3    Vss           N10   N.C.
B2    LAD24         E1    LAD28         K12   Vcc           N11   N.C.
B3    LAD22         E2    LAD26         K13   N.C.          N12   N.C.
B4    LAD21         E3    LAD27         K14   N.C.          N13   N.C.
B5    LAD18         E12   N.C.          L1    DEN           N14   N.C.
B6    LAD15         E13   Vss           L2    N.C.          P1    Vcc
B7    LAD12         E14   N.C.          L3    Vcc           P2    N.C.
B8    LAD10         F1    LAD29         L12   Vss           P3    N.C.
B9    LAD6          F2    LAD31         L13   N.C.          P4    N.C.
B10   LAD2          F3    CACHE         L14   N.C.          P5    N.C.
B11   CLK2          F12   N.C.          M1    N.C.          P6    N.C.
B12   LAD0          F13   N.C.          M2    Vcc           P7    N.C.
B13   RESET         F14   N.C.          M3    Vss           P8    N.C.
B14   Vss           G1    LAD30         M4    Vss           P9    N.C.
C1    HOLD/HLDAR    G2    READY         M5    Vcc           P10   N.C.
C2    LAD25         G3    BE1           M6    N.C.          P11   N.C.
C3    BADAC         G12   N.C.          M7    N.C.          P12   N.C.
C4    Vcc           G13   N.C.          M8    N.C.          P13   Vss
C5    Vss           G14   N.C.          M9    N.C.          P14   Vcc
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Table 6. 80960KA PGA Pinout— In Signal Order 


Signal 


Pin 


Signal 


Pin 


Signal 


Pin 


Signal 


Pin 


ADS 


D2 


LAD! 5 


B6 


N.C. 


J14 


N.C. 


P9 


ALE 


D1 


LADie 


A5 


N.C. 


K13 


N.C. 


P10 




C3 


LADi 7 


A4 


N.C. 


K14 


N.C. 


P11 


BADAC 


BE^ 


H2 


LADie 


B5 


N.C. 


L13 


N.C. 


P12 


BET 


G3 


LAD19 


A3 


N.C. 


L14 


N.C. 


L2 


BEi 


J2 


LAD 20 


C6 


N.C. 


M1 




G2 


READY 


BEi 


K1 


LAD 21 


B4 


N.C. 


M6 


RESET 


B13 


CACHE 


F3 


LAD22 


B3 


N.C. 


M7 


Vcc 


A1 


CLK2 


B11 


LAD 23 


B1 


N.C. 


M8 


Vcc 


A14 


DEN 


L1 


LAD 24 


B2 


N.C. 


M9 


v cc 


C4 


DT/R 


J1 


LAD 25 


C2 


N.C. 


M12 


Vcc 


C10 




K2 


LAD 26 


E2 


N.C. 


M13 


Vcc 


D12 


FAILURE 


HLDA/HOLDR 


D3 


LAD 27 


E3 


N.C. 


M14 


Vcc 


K12 


HOLD/HLDAR 


C1 


LAD 28 


E1 


N.C. 


N2 


Vcc 


L3 


IAC/TnTq 
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F1 


N.C. 


N3 


Vcc 


M2 


INT! 


C13 
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G1 


N.C. 


N4 


Vcc 


M5 


INT 2 /INTR 


A13 
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F2 


N.C. 


N5 


Vcc 
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C12 




H3 


N.C. 


N6 


Vcc 


P1 
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B12 
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D13 


N.C. 


N7 


Vcc 


P14 
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N.C. 


N8 


v S s 


A2 


LAD 2 


B10 
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v S s 
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N.C. 
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N.C. 


N13 


v ss 


J3 


LAD 7 


A9 
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N.C. 


N14 
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N.C. 


P2 


Vss 


L12 
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N.C. 


G14 


N.C. 
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N.C. 


P4 


Vss 


M4 


LADu 


A7 


N.C. 


H13 


N.C. 


P5 


Vss 


M10 


LAD! 2 


B7 


N.C. 


H14 


N.C. 


P6 


Vss 


N1 


LAD! 3 


C7 


N.C. 


J12 


N.C. 


P7 


Vss 


P13 


LAD! 4 


A6 


N.C. 


J13 


N.C. 


P8 


W/R 


H1 
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MEASURE PGA CASE TEMPERATURE 
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Figure 26. Measuring 80960KA PGA and PQFP Case Temperature 
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Figure 27. 10 MHz 80960 K-Series Maximum Allowable Ambient Temperature 
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Figure 28. 16 MHz 80960 K-Series Maximum Allowable Ambient Temperature 
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Figure 29. 20 MHz 80960 K-Series Maximum Allowable Ambient Temperature 
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Figure 30. Maximum Allowable Ambient Temperature for 
the 80960KA at 25 MHz (available in PGA only) 
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Figure 31. Maximum Allowable Ambient Temperature for the Extended 
Temperature TA-80960KA at 20 MHz (available in PGA only) 
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NC 
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47 


- 


NC 


LAD 18 




120 


46 




NC 
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121 


45 




NC 


LAD20 




122 


44 
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NC 


LAD21 
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43 
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NC 


LAD22 
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42 
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125 


41 
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v cc 
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NC 
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39 
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NC 
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NC 
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NC 
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NC 
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Figure 32. 80960KA PQFP Pinout— View from Top 
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Table 8. 80960KA Plastic Package Pinout— In Pin Order 


Pin  Signal        Pin  Signal        Pin  Signal        Pin  Signal
1    HLDA/HOLDR    34   N.C.          67   Vss           100  LAD0
2    ALE           35   Vcc           68   Vss           101  LAD1
3    LAD26         36   Vcc           69   N.C.          102  LAD2
4    LAD27         37   N.C.          70   Vcc           103  Vss
5    LAD28         38   N.C.          71   Vcc           104  LAD3
6    LAD29         39   N.C.          72   N.C.          105  LAD4
7    LAD30         40   N.C.          73   Vss           106  LAD5
8    LAD31         41   Vcc           74   Vcc           107  LAD6
9    Vss           42   Vss           75   N.C.          108  LAD7
10   CACHE         43   N.C.          76   N.C.          109  LAD8
11   W/R           44   N.C.          77   N.C.          110  LAD9
12   READY         45   N.C.          78   N.C.          111  LAD10
13   DT/R          46   N.C.          79   Vss           112  LAD11
14   BE0           47   N.C.          80   Vss           113  LAD12
15   BE1           48   N.C.          81   N.C.          114  Vss
16   BE2           49   N.C.          82   Vcc           115  LAD13
17   BE3           50   N.C.          83   Vcc           116  LAD14
18   FAILURE       51   N.C.          84   Vss           117  LAD15
19   Vss           52   Vss           85   IAC/INT0      118  LAD16
20   LOCK          53   Vss           86   INT1          119  LAD17
21   DEN           54   N.C.          87   INT2/INTR     120  LAD18
22   Vss           55   Vcc           88   INT3/INTA     121  LAD19
23   Vss           56   Vcc           89   N.C.          122  LAD20
24   N.C.          57   Vss           90   Vss           123  LAD21
25   N.C.          58   N.C.          91   CLK2          124  LAD22
26   Vss           59   N.C.          92   Vcc           125  Vss
27   Vss           60   N.C.          93   RESET         126  LAD23
28   N.C.          61   N.C.          94   N.C.          127  LAD24
29   Vcc           62   N.C.          95   N.C.          128  LAD25
30   Vcc           63   N.C.          96   N.C.          129  BADAC
31   N.C.          64   N.C.          97   N.C.          130  HOLD/HLDAR
32   Vss           65   N.C.          98   N.C.          131  N.C.
33   Vss           66   N.C.          99   Vss           132  ADS
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Table 9. 80960KA Plastic Package Plnout— In Signal Order 


Signal        Pin   Signal        Pin   Signal        Pin   Signal        Pin
ADS           132   LAD22         124   N.C.          49    Vcc           41
ALE           2     LAD23         126   N.C.          50    Vcc           55
BADAC         129   LAD24         127   N.C.          51    Vcc           56
BE0           14    LAD25         128   N.C.          54    Vcc           70
BE1           15    LAD26         3     N.C.          58    Vcc           71
BE2           16    LAD27         4     N.C.          59    Vcc           74
BE3           17    LAD28         5     N.C.          60    Vcc           82
CACHE         10    LAD29         6     N.C.          61    Vcc           83
CLK2          91    LAD3          104   N.C.          62    Vcc           92
DEN           21    LAD30         7     N.C.          63    Vss           9
DT/R          13    LAD31         8     N.C.          64    Vss           19
FAILURE       18    LAD4          105   N.C.          65    Vss           22
HLDA/HOLDR    1     LAD5          106   N.C.          66    Vss           23
HOLD/HLDAR    130   LAD6          107   N.C.          69    Vss           26
IAC/INT0      85    LAD7          108   N.C.          72    Vss           27
INT1          86    LAD8          109   N.C.          75    Vss           32
INT2/INTR     87    LAD9          110   N.C.          76    Vss           33
INT3/INTA     88    LOCK          20    N.C.          77    Vss           42
LAD0          100   N.C.          24    N.C.          78    Vss           52
LAD1          101   N.C.          25    N.C.          81    Vss           53
LAD10         111   N.C.          28    N.C.          89    Vss           57
LAD11         112   N.C.          31    N.C.          94    Vss           67
LAD12         113   N.C.          34    N.C.          95    Vss           68
LAD13         115   N.C.          37    N.C.          96    Vss           73
LAD14         116   N.C.          38    N.C.          97    Vss           79
LAD15         117   N.C.          39    N.C.          98    Vss           80
LAD16         118   N.C.          40    N.C.          131   Vss           84
LAD17         119   N.C.          43    READY         12    Vss           90
LAD18         120   N.C.          44    RESET         93    Vss           99
LAD19         121   N.C.          45    Vcc           29    Vss           103
LAD2          102   N.C.          46    Vcc           30    Vss           114
LAD20         122   N.C.          47    Vcc           35    Vss           125
LAD21         123   N.C.          48    Vcc           36    W/R           11
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Table 10. 80960KA PGA Package Thermal Characteristics 



Thermal Resistance (°C/Watt)

                                               Airflow, ft./min (m/sec)
Parameter                                      0      50      100     200     400     600     800
                                               (0)    (0.25)  (0.50)  (1.01)  (2.03)  (3.04)  (4.06)
Junction-to-Case
(Case Measured as shown in Figure 26)          2      2       2       2       2       2       2
Case-to-Ambient (No Heatsink)                  19     18      17      15      12      10      9
Case-to-Ambient
(with Omnidirectional Heatsink)                16     15      14      12      9       7       6
Case-to-Ambient
(with Unidirectional Heatsink)                 15     14      13      11      8       6       5



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NOTES: 

1. This table applies to 80960KA PGA plugged into socket or soldered directly into board. 

2. θJA = θJC + θCA. 

3. θJ-CAP = 4°C/W (approx.) 
   θJ-PIN = 4°C/W (inner pins) (approx.) 
   θJ-PIN = 8°C/W (outer pins) (approx.) 



Table 11. 80960KA PQFP Package Thermal Characteristics 



PQFP Thermal Resistance (°C/Watt)

                                               Airflow, ft./min (m/sec)
Parameter                                      0      50      100     200     400     600     800
                                               (0)    (0.25)  (0.50)  (1.01)  (2.03)  (3.04)  (4.06)
Junction-to-Case
(Case Measured as shown in Figure 26)          9      9       9       9       9       9       9
θ Case-to-Ambient (No Heatsink)                22     19      18      16      11      9       8



NOTES: 

1. This table applies to 80960KA PQFP soldered directly into board. 

2. θJA = θJC + θCA. 

3. θJL = 18°C/Watt 
   θJB = 18°C/Watt 
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Figure 33. Read Transaction 
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Figure 34. Write Transaction with One Wait State 
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Figure 35. Burst Read Transaction 
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Figure 36. Burst Write Transaction with One Wait State 
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NOTE: 

INTR can go low no sooner than 5 ns (input hold time) following the beginning of interrupt acknowledgement cycle 1. 
For a second interrupt to be acknowledged, INTR must be low for at least three cycles before it can be reasserted. 




Figure 37. Interrupt Acknowledge Transaction 
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Figure 38. Bus Exchange Transaction (PBM = Primary Bus Master, SBM = Secondary Bus Master) 
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EMBEDDED 32-BIT PROCESSOR 

WITH INTEGRATED FLOATING-POINT UNIT 



• High-Performance Embedded Architecture 
  — 25 MIPS Burst Execution at 25 MHz 
  — 9.4 MIPS* Sustained Execution at 25 MHz 

• On-Chip Floating-Point Unit 
  — Supports IEEE 754 Standard 
  — Four 80-Bit Registers 
  — 5.2 Million Whetstones/s at 25 MHz 

• 512-Byte On-Chip Instruction Cache 
  — Direct Mapped 
  — Parallel Load/Decode for Uncached Instructions 

• 4 Gigabyte, Linear Address Space 

• 132-Lead PGA and PQFP Packages 

• Multiple Register Sets 
  — Sixteen Global 32-Bit Registers 
  — Sixteen Local 32-Bit Registers 
  — Four Local Register Sets Stored On-Chip 
  — Register Scoreboarding 

• Built-in Interrupt Controller 
  — 32 Priority Levels, 256 Vectors 
  — 3.4 µs Latency 

• Easy to Use, High Bandwidth 32-Bit Bus 
  — 66.7 Mbytes/s Burst 
  — Up to 16 Bytes Transferred per Burst 

• Uses 85C960 Bus Controller 

• Supported by 27960KX Burst EPROMs 




The 80960KB is the first member of Intel's new 32-bit processor family, the i960 series, which is designed 
especially for embedded applications. It is based on the family's high performance, common core architecture, 
and includes a 512-byte instruction cache, a built-in interrupt controller, and an integrated floating-point unit. 
The 80960KB has a large register set, multiple parallel execution units and a high-bandwidth, burst bus. Using 
advanced RISC technology, this high performance processor is capable of execution rates in excess of 9.4 
million instructions per second.* The 80960KB is well-suited for a wide range of embedded applications, 
including laser printers, image processing, industrial control, robotics and telecommunications. 

*Relative to Digital Equipment Corporation's VAX-11/780** at 1 MIPS 



4 80-BIT 

FP 
REGISTERS 



80- BIT 
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INSTRUCTION 
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16 32-BIT 
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INSTRUCTION 

CACHE 



64 by 32-BIT LOCAL REGISTER CACHE 
32-BIT IEU 
INSTRUCTION DECODER 
MICRO-INSTRUCTION SEQUENCER 
MICRO-INSTRUCTION ROM 
BUS CONTROL LOGIC AND INTERRUPT CONTROLLER 
32-BIT BURST BUS 

Figure 1. The 80960KB's Highly Parallel Microarchitecture 

**VAX-11™ is a trademark of Digital Equipment Corporation. 
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THE 960 SERIES 

The 80960KB is the first member of a new family of 
32-bit microprocessors from Intel known as the 960 
Series. This series was especially designed to serve 
the needs of embedded applications. The embed- 
ded market includes applications as diverse as in- 
dustrial automation, avionics, image processing, 
graphics, robotics, telecommunications and automo- 
biles. These types of applications require high 
integration, low power consumption, quick interrupt 
response times and high performance. Since time to 
market is critical, embedded microprocessors need 
to be easy to use in both hardware and software 
designs. 



All members of the 80960 series share a common 
core architecture which utilizes RISC technology so 
that, except for special functions, the family mem- 
bers are object code compatible. Each new proces- 
sor in the series will add its own special set of func- 
tions to the core to satisfy the needs of a specific 
application or range of applications in the embedded 
market. For example, future processors may include 
a DMA controller, a timer or an A/D converter. 

The 80960KB includes an integrated floating-point 
unit. Intel also offers a pin-compatible version, called 
the 80960KA, without an FPU, and a military-grade 
version, the 80960MC, with support for memory 
management, multitasking, multiprocessing and fault 
tolerance. 



g0 through g15: GLOBAL REGISTERS(1), SIXTEEN 32-BIT REGISTERS 

fp0 through fp3: FLOATING-POINT REGISTERS, FOUR 80-BIT REGISTERS 

r0 through r15: LOCAL REGISTERS(2), SIXTEEN 32-BIT REGISTERS 

ARITHMETIC CONTROLS (32 BITS) 
INSTRUCTION POINTER (32 BITS) 
PROCESS CONTROLS (32 BITS) 
TRACE CONTROLS (32 BITS) 

ADDRESS SPACE: up to 2^32 - 1 



NOTES: 

1. Register g15 is reserved for stack management functions. 

2. Registers r0, r1, and r2 are reserved for stack management functions. 



Figure 2. Register Set 
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KEY PERFORMANCE FEATURES 

The 80960KB's architecture is based on the most 
recent advances in RISC technology and is ground- 
ed in Intel's long experience in designing embedded 
controllers. Many features contribute to the 
80960KB's exceptional performance: 

1. Large Register Set. Having a large number of 
registers reduces the number of times that a proces- 
sor needs to access memory. Modern compilers can 
take advantage of this feature to optimize execution 
speed. For maximum flexibility, the 80960KB pro- 
vides 32 32-bit registers and four 80-bit floating- 
point registers. (See Figure 2.) 

2. Fast Instruction Execution. Simple functions 
make up the bulk of instructions in most programs, 
so that execution speed can be greatly improved by 
ensuring that these core instructions execute in as 
short a time as possible. The most frequently 
executed instructions, such as register-register moves, 
add/subtract, logical operations, and shifts, execute 
in one to two cycles. (Table 1 contains a list of 
instructions.) 

3. Load/Store Architecture. Like other processors 
based on RISC technology, the 80960KB has a 
Load/Store architecture: only the LOAD and STORE 
instructions reference memory; all other instructions 
operate on registers. This type of architecture 
simplifies instruction decoding and is used in combination 
with other techniques to increase parallelism. 

Figure 3. Instruction Formats 

Table 1. 80960KB Instruction Set 

Data Movement: Load, Store, Move, Load Address 

Arithmetic: Add, Subtract, Multiply, Divide, Remainder, 
Modulo, Shift, Extended Multiply, Extended Divide 

Logical: And, Not And, And Not, Or, Exclusive Or, Not Or, 
Or Not, Nor, Exclusive Nor, Not, Nand, Rotate 

Bit and Bit Field: Set Bit, Clear Bit, Not Bit, Check Bit, 
Alter Bit, Scan for Bit, Scan over Bit, Extract, Modify 

Comparison: Compare, Conditional Compare, 
Compare and Increment, Compare and Decrement 

Branch: Unconditional Branch, Conditional Branch, 
Compare and Branch 

Call/Return: Call, Call Extended, Call System, Return, 
Branch and Link 

Fault: Conditional Fault, Synchronize Faults 

Debug: Modify Trace Controls, Mark, Force Mark 

Miscellaneous: Atomic Add, Atomic Modify, 
Flush Local Registers, Modify Arithmetic Controls, 
Modify Process Controls, Scan Byte for Equal, 
Test Condition Code 

Decimal: Move, Add with Carry, Subtract with Carry 

Conversion: Convert Real to Integer, Convert Integer to Real 

Floating-Point: Move Real, Add, Subtract, Multiply, Divide, 
Remainder, Scale, Round, Square Root, Sine, Cosine, Tangent, 
Arctangent, Log, Log Binary, Log Natural, Exponent, Classify, 
Copy Real Extended, Compare 

Synchronous: Synchronous Load, Synchronous Move 

4. Simple Instruction Formats. All instructions in 
the 80960KB are 32 bits long and must be aligned 
on word boundaries. This alignment makes it possi- 
ble to eliminate the instruction-alignment stage in 
the pipeline. To simplify the instruction decoder fur- 
ther, there are only five instruction formats and each 
instruction uses only one format. (See Figure 3.) 

5. Overlapped Instruction Execution. A load oper- 
ation allows execution of subsequent instructions to 
continue before the data has been returned from 
memory, so that these instructions can overlap the 
load. The 80960KB manages this process transpar- 
ently to software through the use of a register score- 
board. Conditional instructions also make use of a 
scoreboard so that subsequent unrelated instruc- 
tions can be executed while the conditional instruc- 
tion is pending. 

6. Integer Execution Optimization. When the re- 
sult of an operation is used as an operand in a sub- 
sequent calculation, the value is sent immediately to 
its destination register. Yet at the same time, the 
value is put back on a bypass path to the ALU, 
thereby saving the time that otherwise would be re- 
quired to retrieve the value for the next operation. 

7. Bandwidth Optimizations. The 80960KB gets 
optimal use of its memory bus bandwidth because 
the bus is tuned for use with the cache: the line size 
of the instruction cache matches the maximum burst 
size for instruction fetches. The 80960KB automati- 
cally fetches four words in a burst and stores them 
directly in the cache. Due to the size of the cache 
and the fact that it is continually filled in anticipation 
of needed instructions in the program flow, the 
80960KB is exceptionally insensitive to memory wait 
states. In fact, each wait state causes only a 7% 
degradation in system performance. The benefit is 
that the 80960KB will deliver outstanding perform- 
ance even with a low cost memory system. 

8. Cache Bypass. If there is a cache miss, the proc- 
essor fetches the needed instruction, then sends it 
on to the instruction decoder at the same time it 
updates the cache. Thus, no extra time is taken to 
load and read the cache. 



Memory Space and Addressing Modes 

The 80960KB offers a linear programming environment 
so that all programs running on the processor 
are contained in a single address space. The maximum 
size of the address space is 4 Gigabytes (2^32 
bytes). 

For ease of use, the 80960KB has a small number of 
addressing modes, but includes all those necessary 
to ensure efficient compiler implementations of 
high-level languages such as C, Fortran and Ada. Table 2 
lists the memory addressing modes. 



Data Types 

The 80960KB recognizes the following data types: 

Numeric: 

• 8-, 16-, 32- and 64-bit ordinals 
• 8-, 16-, 32- and 64-bit integers 
• 32-, 64- and 80-bit real numbers 

Non-Numeric: 

• Bit 
• Bit Field 
• Triple-Word (96 bits) 
• Quad-Word (128 bits) 



Large Register Set 

The programming environment of the 80960KB 
includes a large number of registers. In fact, 36 
registers are available at any time. The availability of this 
many registers greatly reduces the number of 
memory accesses required to execute most programs, 
which leads to greater instruction processing speed. 

There are two types of general-purpose registers: 
local and global. The 20 global registers consist of 
sixteen 32-bit registers (G0 through G15) and four 
80-bit registers (FP0 through FP3). These registers 
perform the same function as the general-purpose 
registers provided in other popular microprocessors. 
The term global refers to the fact that these 
registers retain their contents across procedure calls. 

The local registers, on the other hand, are 
procedure specific. For each procedure call, the 80960KB 
allocates 16 local registers (R0 through R15). Each 
local register is 32 bits wide. Any register can also 
be used for single or double-precision floating-point 
operations; the 80-bit floating-point registers are 
provided for extended precision. 



Multiple Register Sets 

To further increase the efficiency of the register set, 
multiple sets of local registers are stored on-chip. 
This cache holds up to four local register frames, 
which means that up to three procedure calls can be 
made without having to access the procedure stack 
resident in memory. 

Although programs may have procedure calls 
nested many calls deep, a program typically oscillates 
back and forth between only two or three levels. As 
a result, with four stack frames in the cache, the 
probability of there being a free frame in the cache 
when a call is made is very high. In fact, runs of 
representative C-language programs show that 80% 
of the calls are handled without needing to access 
memory. 

If there are four or more active procedures and a 
new procedure is called, the processor moves the 
oldest set of local registers in the register cache to a 
procedure stack in memory to make room for a new 
set of registers. Global register G15 is used by the 
processor as the frame pointer (FP) for the 
procedure stack. 

Note that the global and floating-point registers are 
not exchanged on a procedure call, but retain their 
contents, making them available to all procedures 
for fast parameter passing. An illustration of the 
register cache is shown in Figure 4. 

Table 2. Memory Addressing Modes 

• 12-Bit Offset 
• 32-Bit Offset 
• Register-Indirect 
• Register + 12-Bit Offset 
• Register + 32-Bit Offset 
• Register + (Index-Register x Scale-Factor) 
• Register x Scale-Factor + 32-Bit Displacement 
• Register + (Index-Register x Scale-Factor) + 32-Bit Displacement 

Scale-Factor is 1, 2, 4, 8 or 16 
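
The addressing modes listed in Table 2 all reduce to a sum of an optional offset or displacement, an optional base register, and an optional scaled index register. The following is a minimal sketch of that arithmetic, not a description of the hardware. The function name `effective_address`, its parameters, and the wrap-around to 32 bits are assumptions made for illustration.

```python
def effective_address(offset=0, base=None, index=None, scale=1):
    """Compute a 32-bit effective address (illustrative model only).

    offset -- 12-bit or 32-bit offset/displacement, 0 if the mode has none
    base   -- contents of the base register, or None if the mode has none
    index  -- contents of the index register, or None if the mode has none
    scale  -- scale factor applied to the index register
    """
    if scale not in (1, 2, 4, 8, 16):
        raise ValueError("scale factor must be 1, 2, 4, 8 or 16")
    address = offset
    if base is not None:
        address += base
    if index is not None:
        address += index * scale
    # Assumption: results wrap within the 4-Gigabyte linear address space.
    return address & 0xFFFFFFFF


# Register + 12-Bit Offset
print(hex(effective_address(offset=0x10, base=0x1000)))
# Register + (Index-Register x Scale-Factor) + 32-Bit Displacement
print(hex(effective_address(offset=0x100, base=0x2000, index=3, scale=4)))
```

An array access such as `table[i]` with 4-byte elements maps naturally onto the scaled-index modes, which is one reason such modes help compilers for C, Fortran and Ada.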



[Figure: the register cache holds four local register sets; each local register set consists of sixteen 32-bit registers, r0 through r15.] 

Figure 4. Multiple Register Sets Are Stored On-Chip 
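
The behavior of the register cache can be sketched with a small counter model. It is only an illustration: the class name `RegisterCache` and its fields are hypothetical, and the handling of returns (reading a frame back from memory when the caller's frame is no longer on-chip) is an assumption, since the text above describes only what happens on calls.

```python
class RegisterCache:
    """Counter model of an on-chip cache of local register frames."""

    def __init__(self, frames=4):
        self.frames = frames   # frames held on-chip (four on the 80960KB)
        self.depth = 1         # active procedures, including the current one
        self.cached = 1        # frames currently resident on-chip
        self.spills = 0        # frames written out to the procedure stack
        self.fills = 0         # frames read back in (assumed behavior)

    def call(self):
        if self.cached == self.frames:
            # No free frame: the oldest set of local registers moves to memory.
            self.cached -= 1
            self.spills += 1
        self.cached += 1
        self.depth += 1

    def ret(self):
        if self.depth == 1:
            raise RuntimeError("no caller to return to")
        self.depth -= 1
        self.cached -= 1
        if self.cached == 0:
            # Assumption: the caller's frame was spilled and must be reloaded.
            self.cached = 1
            self.fills += 1


cache = RegisterCache()
for _ in range(3):
    cache.call()
print(cache.spills)   # three nested calls fit in the four on-chip frames
cache.call()
print(cache.spills)   # a fourth nested call forces one frame out to memory
```

In this model a program that oscillates between two or three call levels never touches the procedure stack after the initial calls, which matches the argument made above.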
Instruction Cache 

To further reduce memory accesses, the 80960KB 
includes a 512-byte on-chip instruction cache. The 
instruction cache is based on the concept of locality 
of reference; that is, most programs are not usually 
executed in a steady stream but consist of many 
branches and loops that lead to jumping back and 
forth within the same small section of code. Thus, by 
maintaining a block of instructions in a cache, the 
number of memory references required to read in- 
structions into the processor can be greatly reduced. 

To load the instruction cache, instructions are 
fetched in 16-byte blocks, so that up to four instruc- 
tions can be fetched at one time. An efficient 
prefetch algorithm increases the probability that an 
instruction will already be in the cache when it is 
needed. 

Code for small loops will often fit entirely within the 
cache, leading to a great increase in processing 
speed since further memory references might not be 
necessary until the program exits the loop. Similarly, 
when calling short procedures, the code for the call- 
ing procedure is likely to remain in the cache, so it 
will be there on the procedure's return. 
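
Whether a loop can fit in the instruction cache depends on how many 16-byte blocks it touches, which in turn depends on where the loop starts. The sketch below counts blocks for a run of 4-byte instructions. It considers capacity only and deliberately ignores how the cache maps addresses to lines, which this section does not describe; the function names are hypothetical.

```python
CACHE_BYTES = 512        # on-chip instruction cache size
BLOCK_BYTES = 16         # instructions are fetched in 16-byte blocks
INSTRUCTION_BYTES = 4    # every instruction is one 32-bit word


def blocks_touched(start_address, instruction_count):
    """Number of 16-byte blocks covered by a straight run of instructions."""
    if instruction_count <= 0:
        return 0
    first = start_address // BLOCK_BYTES
    last_byte = start_address + instruction_count * INSTRUCTION_BYTES - 1
    return last_byte // BLOCK_BYTES - first + 1


def loop_may_fit(start_address, instruction_count):
    """Capacity check only: True if the loop needs no more blocks than exist."""
    return blocks_touched(start_address, instruction_count) <= CACHE_BYTES // BLOCK_BYTES


print(blocks_touched(0, 4))     # aligned: four instructions share one block
print(blocks_touched(8, 4))     # unaligned: the same four span two blocks
print(loop_may_fit(0, 128))     # 128 instructions fill the cache exactly
print(loop_may_fit(8, 128))     # the same loop, unaligned, needs one block more
```

Aligning the top of a hot loop on a 16-byte boundary is therefore a cheap way to make the most of each burst fetch.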



Register Scoreboarding 

The instruction decoder has been optimized in sev- 
eral ways. One of these optimizations is the ability to 
do instruction overlapping by means of register 
scoreboarding. 

Register scoreboarding occurs when a LOAD in- 
struction is executed to move a variable from memo- 
ry into a register. When the instruction is initiated, a 
scoreboard bit on the target register is set. When the 
register is actually loaded, the bit is reset. In 
between, any reference to the register contents is 
accompanied by a test of the scoreboard bit to ensure 
that the load has completed before processing 
continues. Since the processor does not have to wait for 
the LOAD to be completed, it can go on to execute 
additional instructions placed in between the LOAD 
instruction and the instruction that uses the register 
contents, as shown in the following example: 

LOAD R4, address 1 
LOAD R5, address 2 
Unrelated instruction 
Unrelated instruction 
ADD R4, R5, R6 



In essence, the two unrelated instructions between 
the LOAD and ADD instructions are executed for 
free (i.e., take no apparent time to execute) because 
they are executed while the register is being loaded. 
Up to three LOAD instructions can be pending at 
one time with three corresponding scoreboard bits 
set. By exploiting this feature, system programmers 
and compilers have a useful tool for optimizing exe- 
cution speed. 
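
The effect of scoreboarding on execution time can be illustrated with a toy timing model. Everything numeric here is an assumption: one instruction issues per cycle, a LOAD delivers its data a fixed `load_latency` cycles after it issues, and a LOAD issued while three loads are already pending waits for the earliest one to finish. The function name `run` and the tuple format are hypothetical.

```python
def run(program, load_latency=4, max_pending=3):
    """Return the cycle count for a list of (op, dest, sources) tuples."""
    ready_at = {}   # register -> cycle at which its scoreboard bit clears
    cycle = 0
    for op, dest, sources in program:
        # A reference to a scoreboarded register stalls until the load is done.
        for reg in sources:
            cycle = max(cycle, ready_at.get(reg, 0))
        if op == "LOAD":
            pending = sorted(t for t in ready_at.values() if t > cycle)
            if len(pending) >= max_pending:
                cycle = pending[0]          # wait for a scoreboard slot
            ready_at[dest] = cycle + load_latency
        cycle += 1                          # one issue slot per instruction
    return max([cycle] + list(ready_at.values()))


loads = [("LOAD", "r4", []), ("LOAD", "r5", [])]
unrelated = [("MOV", "g1", ["g0"]), ("MOV", "g3", ["g2"])]
add = [("ADD", "r6", ["r4", "r5"])]

print(run(loads + unrelated + add))   # with two unrelated instructions
print(run(loads + add))               # without them
```

Under these assumptions both sequences finish in the same number of cycles, so the two unrelated instructions are hidden behind the load latency. Changing `load_latency` shows how many instructions a compiler would need to schedule between a LOAD and its first use.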



Floating-Point Arithmetic 

In the 80960KB, floating-point arithmetic has been 
made an integral part of the architecture. Having the 
floating-point unit integrated on-chip provides two 
advantages. First, it improves the performance of 
the chip for floating-point applications, since no 
additional bus overhead is associated with floating- 
point calculations, thereby leaving more time for oth- 
er bus operations such as I/O. Second, the cost of 
using floating-point operations is reduced because a 
separate coprocessor chip is not required. 

The 80960KB floating-point (real number) data types 
include single-precision (32-bit), double-precision 
(64-bit), and extended precision (80-bit) floating- 
point numbers. Any register may be used to execute 
floating-point operations. 

The processor provides hardware support for both 
mandatory and recommended portions of IEEE 
Standard 754 for floating-point arithmetic, including 
all arithmetic, exponential, logarithmic, and other 
transcendental functions. Table 3 shows execution 
times for some representative instructions. 

Table 3. Sample Floating-Point Execution 
Times (μs) at 25 MHz 

Function       32-Bit    64-Bit 
Add              0.4       0.5 
Subtract         0.4       0.5 
Multiply         0.7       1.3 
Divide           1.3       2.9 
Square Root      3.7       3.9 
Arctangent      10.1      13.1 
Exponent        11.3      12.5 
Sine            15.2      16.6 
Cosine          15.2      16.6 

High Bandwidth Local Bus 

An 80960KB CPU resides on a high-bandwidth ad- 
dress/data bus known as the local bus (L-Bus). The 
L-Bus provides a direct communication path be- 
tween the processor and the memory and I/O sub- 
system interfaces. The processor uses the local bus 
to fetch instructions, manipulate memory, and re- 
spond to interrupts. Its features include: 

• 32-bit multiplexed address/data path 

• Four-word burst capability, which allows transfers 
from 1 to 16 bytes at a time 

• High bandwidth reads and writes at 66.7 Mbytes 
per second 

• Special signal to indicate whether a memory 
transaction can be cached 

Figure 5 identifies the groups of signals which con- 
stitute the L-Bus. Table 4 lists the function of the L- 
Bus and other processor-support signals, such as 
the interrupt lines. 
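
The quoted bandwidth can be related to the burst size with simple arithmetic. The sketch below assumes that a four-word burst occupies six bus cycles at a 25 MHz bus clock and that a Mbyte is 10^6 bytes. The six-cycle figure is not stated in this section; it is chosen here because it reproduces the 66.7 Mbytes per second quoted above. The function name is hypothetical.

```python
def burst_bandwidth_mbytes(clock_mhz, words=4, cycles_per_burst=6):
    """Peak transfer rate in Mbytes/s for back-to-back bursts.

    cycles_per_burst is an assumption (see the text above), not a
    documented parameter.
    """
    bytes_moved = words * 4               # each word is 32 bits
    return bytes_moved * clock_mhz / cycles_per_burst


for mhz in (16, 20, 25):
    print(mhz, "MHz:", round(burst_bandwidth_mbytes(mhz), 1), "Mbytes/s")
```

With fewer words per transaction the fixed per-transaction cycles are amortized over less data, so the effective rate drops; this is why matching the cache line size to the maximum burst size matters.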



Interrupt Handling 

The 80960KB can be interrupted in one of two ways: 
by the activation of one of four interrupt pins or by 
sending a message on the processor's data bus. 

The 80960KB is unusual in that it automatically han- 
dles interrupts on a priority basis and tracks pending 
interrupts through its on-chip interrupt controller. 
Two of the interrupt pins can be configured to pro- 
vide 8259A handshaking for expansion beyond four 
interrupt lines. 



Debug Features 

The 80960KB has built-in debug capabilities. There 
are two types of breakpoints and six different trace 
modes. The debug features are controlled by two 
internal 32-bit registers, the Process-Controls Word 
and the Trace-Controls Word. By setting bits in 
these control words, a software debug monitor can 
closely control how the processor responds during 
program execution. 

The 80960KB has both hardware and software 
breakpoints. It provides two hardware breakpoint 
registers on-chip which can be set by a special com- 
mand to any value. When the instruction pointer 
matches the value in one of the breakpoint registers, 
the breakpoint will fire, and a breakpoint handling 
routine is called automatically. 

The 80960KB also provides software breakpoints 
through the use of two instructions, MARK and 
FMARK. These instructions can be placed at any 
point in a program and will cause the processor to 
halt execution at that point and call the breakpoint 
handling routine. The breakpoint mechanism is easy 
to use and provides a powerful debugging tool. 

Tracing is available for instructions (single-step exe- 
cution), calls and returns, and branching. Each dif- 
ferent type of trace may be enabled separately by a 
special debug instruction. In each case, the 
80960KB executes the instruction first and then 
calls a trace handling routine (usually part of a soft- 
ware debug monitor). Further program execution is 
halted until the trace routine is completed. When the 
trace event handling routine is completed, 
instruction execution resumes at the next instruction. The 
80960KB's tracing mechanisms, which are 
implemented completely in hardware, greatly simplify the 
task of testing and debugging software. 

Figure 5. Local Bus Signal Groups 



FAULT DETECTION 

The 80960KB has an automatic mechanism to 
handle faults. There are ten fault types including 
trace, arithmetic, and floating-point faults. When the 
processor detects a fault, it automatically calls the 
appropriate fault handling routine and saves the cur- 
rent instruction pointer and necessary state informa- 
tion to make efficient recovery possible. The proces- 
sor posts diagnostic information on the type of fault 
to a Fault Record. Like interrupt handling routines, 
fault handling routines are usually written to meet 
the needs of a specific application and are often in- 
cluded as part of the operating system or kernel. 

For each of the ten fault types, there are numerous 
subtypes that provide specific information about a 
fault. For example, a floating-point fault may have its 
subtype set to an Overflow or Zero-Divide fault. The 
fault handler can use this specific information to re- 
spond correctly to the fault. 



BUILT-IN TESTABILITY 

Upon reset, the 80960KB automatically conducts an 
extensive internal test (self-test) of its major blocks 
of logic. Then, before executing its first instruction, it 
does a zero checksum on the first eight words in 
memory to ensure that the system has been loaded 
correctly. If a problem is discovered at any point 
during the self-test, the 80960KB will assert its 
FAILURE pin and will not begin program execution. The 
self-test takes approximately 47,000 cycles to 
complete. 

System manufacturers can use the 80960KB's self- 
test feature during incoming parts inspection. No 
special diagnostic programs need to be written, and 
the test is both thorough and fast. The self-test ca- 
pability helps ensure that defective parts will be dis- 
covered before systems are shipped, and once in 
the field, the self-test makes it easier to distinguish 
between problems caused by processor failure and 
problems resulting from other causes. 
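
For test planning it can help to convert the self-test length into time. The sketch below assumes that the 47,000 cycles quoted above are processor clock cycles at the part's rated frequency; the function name is hypothetical.

```python
def self_test_ms(clock_mhz, cycles=47_000):
    """Approximate self-test duration in milliseconds.

    Assumes `cycles` are processor clock cycles at `clock_mhz`.
    """
    return cycles / (clock_mhz * 1000.0)


for mhz in (10, 16, 20, 25):
    print(mhz, "MHz:", round(self_test_ms(mhz), 2), "ms")
```

Under this assumption the self-test completes in a few milliseconds at every available speed grade.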



CHMOS 

The 80960KB is fabricated using Intel's CHMOS IV 
(Complementary High Speed Metal Oxide Semicon- 
ductor) process. This advanced technology elimi- 
nates the frequency and reliability limitations of older 
CMOS processes and opens a new era in micro- 
processor performance. It combines the high per- 
formance capabilities of Intel's industry-leading 
HMOS technology with the high density and low 
power characteristics of CMOS. The 80960KB is 
available at 10, 16, 20 and 25 MHz. 




Table 4a. 80960KB Pin Description: L-Bus Signals 

Each entry gives the symbol, the type in parentheses, and the name and function. 
I/O = Input/Output, O = Output, I = Input, O.D. = Open-Drain, T.S. = tri-state 

CLK2 (I) 
SYSTEM CLOCK provides the fundamental timing for 80960KB systems. It is 
divided by two inside the 80960KB to generate the internal processor clock. 

LAD31-LAD0 (I/O, T.S.) 
LOCAL ADDRESS/DATA BUS carries 32-bit physical addresses and data to and 
from memory. During an address (Ta) cycle, bits 2-31 contain a physical word 
address (bits 0-1 indicate SIZE; see below). During a data (Td) cycle, bits 0-31 
contain read or write data. The LAD lines are active HIGH and float to a high 
impedance state when not active. 

SIZE, which is comprised of bits 0-1 of the LAD lines during a Ta cycle, specifies 
the size of a burst transfer in words. 

LAD1  LAD0 
  0     0    1 Word 
  0     1    2 Words 
  1     0    3 Words 
  1     1    4 Words 

ALE (O, T.S.) 
ADDRESS-LATCH ENABLE indicates the transfer of a physical address. ALE is 
asserted during a Ta cycle and deasserted before the beginning of the Td state. It 
is active LOW and floats to a high impedance state during a hold cycle (Th or Thr). 

ADS (O.D.) 
ADDRESS/DATA STATUS indicates an address state. ADS is asserted every Ta 
state and deasserted during the following Td state. For a burst transaction, 
ADS is asserted again every Td state where READY was asserted in the previous 
cycle. 

W/R (O.D.) 
WRITE/READ specifies, during a Ta cycle, whether the operation is a write or 
read. It is latched on-chip and remains valid during Td cycles. 

DT/R (O.D.) 
DATA TRANSMIT/RECEIVE indicates the direction of data transfer to and from 
the L-Bus. It is low during Ta and Td cycles for a read or interrupt 
acknowledgement; it is high during Ta and Td cycles for a write. DT/R never 
changes state when DEN is asserted (see Timing Diagrams). 

DEN (O.D.) 
DATA ENABLE is asserted during Td cycles and indicates transfer of data on the 
LAD bus lines. 

READY (I) 
READY indicates that data on LAD lines can be sampled or removed. If READY is 
not asserted during a Td cycle, the Td cycle is extended to the next cycle by 
inserting a wait state (Tw), and ADS is not asserted in the next cycle. 

LOCK (I/O, O.D.) 
BUS LOCK prevents other bus masters from gaining control of the L-Bus 
following the current cycle (if they would assert LOCK to do so). LOCK is used by 
the processor or any bus agent when it performs indivisible Read/Modify/Write 
(RMW) operations. Do not leave LOCK unconnected. It must be pulled high for the 
processor to function properly. 

For a read that is designated as an RMW-read, LOCK is examined: if asserted, the 
processor waits until it is not asserted; if not asserted, the processor asserts 
LOCK during the Ta cycle and leaves it asserted. 

A write that is designated as an RMW-write deasserts LOCK in the Ta cycle. 
During the time LOCK is asserted, a bus agent can perform a normal read or write 
but no RMW operations. LOCK is also held asserted during an 
interrupt-acknowledge transaction. 

BE3-BE0 (O.D.) 
BYTE ENABLE LINES specify which data bytes (up to four) on the bus take part 
in the current bus cycle. BE3 corresponds to LAD31-LAD24 and BE0 corresponds 
to LAD7-LAD0. 

The byte enables are provided in advance of data. The byte enables asserted 
during Ta specify the bytes of the first data word. The byte enables asserted 
during Td specify the bytes of the next data word (if any), that is, the word to be 
transmitted following the next assertion of READY. The byte enables during the 
Td cycles preceding the last assertion of READY are undefined. The byte enables 
are latched on-chip and remain constant from one Td cycle to the next when 
READY is not asserted. 

For reads, the byte enables specify the byte(s) that the processor will actually use. 
L-Bus agents are required to assert only adjacent byte enables (e.g., asserting just 
BE0 and BE2 is not permitted), and are required to assert at least one byte enable. 
To produce address bits A0 and A1 externally, they can be decoded from the byte 
enables. 

HOLD/HLDAR (I) 
HOLD: If the processor is the primary bus master (PBM), the input is interpreted 
as HOLD, a request from a secondary bus master to acquire the bus. When the 
processor receives HOLD and grants another master control of the bus, it floats 
its tri-state bus lines and then asserts HLDA and enters the Th state. When HOLD 
is deasserted, the processor will deassert HLDA and go to either the Ti or Ta 
state. 

HOLD ACKNOWLEDGE RECEIVED: If the processor is a secondary bus master 
(SBM), the input is HLDAR, which indicates, when HOLDR output is high, that the 
processor has acquired the bus. Processors and other agents can be told at reset 
if they are the primary bus master (PBM). 

HLDA/HOLDR (T.S.) 
HOLD ACKNOWLEDGE: If the processor is a primary bus master, the output is 
HLDA, which relinquishes control of the bus to another bus master. 

HOLD REQUEST: For secondary bus masters (SBM), the output is HOLDR, which 
is a request to acquire the bus. The bus is said to be acquired if the agent is a 
primary bus master and does not have its HLDA output asserted, or if the agent is 
a secondary bus master and has its HOLD input and HLDA output asserted. 

CACHE (T.S.) 
CACHE indicates if an access is cacheable during a Ta cycle. It is not asserted 
during any synchronous access, such as a synchronous load or move instruction 
used for sending an IAC message. The CACHE signal floats to a high impedance 
state when the processor is idle. 
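
External logic that watches the L-Bus has to split the address cycle into a word address and a burst size, and may need to derive the two low address bits from the byte enables. The following sketch models both steps in software. It assumes that the SIZE field in LAD1-LAD0 encodes one to four words as the binary values 00 to 11, and that BE0 selects the lowest-addressed byte of a word; the function names are hypothetical.

```python
def decode_address_cycle(lad):
    """Split the 32-bit LAD value sampled during Ta.

    Returns (byte address of the first word, number of words in the burst).
    Assumes SIZE = LAD1:LAD0 encodes 1..4 words as 0..3.
    """
    lad &= 0xFFFFFFFF
    word_address = lad & 0xFFFFFFFC   # bits 2-31: physical word address
    burst_words = (lad & 0x3) + 1     # bits 0-1: SIZE
    return word_address, burst_words


def low_address_bits(byte_enables):
    """Derive A1:A0 from (BE0, BE1, BE2, BE3), where True means asserted.

    Enforces the two rules stated for L-Bus agents: at least one byte
    enable is asserted, and asserted enables are adjacent.
    Assumes BE0 selects the lowest-addressed byte.
    """
    asserted = [i for i, be in enumerate(byte_enables) if be]
    if not asserted:
        raise ValueError("at least one byte enable must be asserted")
    if asserted != list(range(asserted[0], asserted[-1] + 1)):
        raise ValueError("asserted byte enables must be adjacent")
    return asserted[0]


print(decode_address_cycle(0x00001003))              # four-word burst at 0x1000
print(low_address_bits((False, True, True, False)))  # bytes 1-2: A1:A0 = 1
```

In hardware the same decode is a handful of gates; the point of the model is only to make the bit assignments explicit.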



Table 4b. 80960KB Pin Description: Module Support Signals 

Each entry gives the symbol, the type in parentheses, and the name and function. 
I/O = Input/Output, O = Output, I = Input, O.D. = Open-Drain, T.S. = tri-state 

BADAC (I) 
BAD ACCESS, if asserted in the cycle following the one in which the last READY 
of a transaction is asserted, indicates that an unrecoverable error has occurred on 
the current bus transaction, or that a synchronous load/store instruction has not 
been acknowledged. 

STARTUP: During system reset, the BADAC signal is interpreted differently. If the 
signal is high, it indicates that this processor will perform system initialization. If it 
is low, another processor in the system will perform system initialization instead. 

RESET (I) 
RESET clears the internal logic of the processor and causes it to re-initialize. 

During RESET assertion, the input pins are ignored (except for BADAC and 
IAC/INT0), the tri-state output pins are placed in a high impedance state, and 
other output pins are placed in their non-asserted state. 

RESET must be asserted for at least 41 CLK2 cycles for a predictable RESET. 
The HIGH to LOW transition of RESET should occur after the rising edge of both 
CLK2 and the external bus CLK, and before the next rising edge of CLK2. 

FAILURE (O, O.D.) 
INITIALIZATION FAILURE indicates that the processor has failed to initialize 
correctly. After RESET is deasserted and before the first bus transaction begins, 
FAILURE is asserted while the processor performs a self-test. If the self-test 
completes successfully, then FAILURE is deasserted. Next, the processor 
performs a zero checksum on the first eight words of memory. If it fails, FAILURE 
is asserted for a second time and remains asserted; if it passes, system 
initialization continues and FAILURE remains deasserted. 

N.C. (N/A) 
NOT CONNECTED indicates pins should not be connected. Never connect any 
pin marked N.C. 

IAC/INT0 (I) 
INTERAGENT COMMUNICATION REQUEST/INTERRUPT 0 indicates either 
that there is a pending IAC message for the processor or an interrupt. The bus 
interrupt control register determines in which way the signal should be interpreted. 
To signal an interrupt or IAC request in a synchronous system, this pin (as well as 
the other interrupt pins) must be enabled by being deasserted for at least one bus 
cycle and then asserted for at least one additional bus cycle; in an asynchronous 
system, the pin must remain deasserted for at least two bus cycles and then be 
asserted for at least two more bus cycles. 

LOCAL PROCESSOR NUMBER: This signal is interpreted differently during 
system reset. If the signal is at a high voltage level, it indicates that this processor 
is a primary bus master (Local Processor Number = 0); if it is at a low voltage 
level, it indicates that this processor is a secondary bus master (Local Processor 
Number = 1). 

INT1 (I) 
INTERRUPT 1, like INT0, provides direct interrupt signaling. 

INT2/INTR (I) 
INTERRUPT 2/INTERRUPT REQUEST: The bus interrupt control register determines 
how this pin is interpreted. If INT2, it has the same interpretation as the INT0 and 
INT1 pins. If INTR, it is used to receive an interrupt request from an external 
interrupt controller. 

INT3/INTA (I/O, O.D.) 
INTERRUPT 3/INTERRUPT ACKNOWLEDGE: The bus interrupt control register 
determines how this pin is interpreted. If INT3, it has the same interpretation as 
the INT0, INT1, and INT2 pins. If INTA, it is used as an output to control 
interrupt-acknowledge bus transactions. The INTA output is latched on-chip and remains 
valid during Td cycles; as an output, it is open-drain. 



ELECTRICAL SPECIFICATIONS 



Power and Grounding 

The 80960KB is implemented in CHMOS IV technol- 
ogy and has modest power requirements. Its high 
clock frequency and numerous output buffers (ad- 
dress/data, control, error, and arbitration signals) 
can cause power surges as multiple output buffers 
drive new signal levels simultaneously. For clean 
on-chip power distribution at high frequency, 12 Vcc 
and 13 Vss pins separately feed functional units of 
the 80960KB in the PGA. 

Power and ground connections must be made to all 
power and ground pins of the 80960KB. On the cir- 
cuit board, all Vcc pins must be strapped closely 
together, preferably on a power plane. Likewise, all 
Vss pins should be strapped together, preferably on 
a ground plane. These pins may not be connected 
together within the chip. 



Power Decoupling Recommendations 

Liberal decoupling capacitance should be placed 
near the 80960KB. The processor can cause tran- 
sient power surges when driving the L-Bus, particu- 
larly when it is connected to a large capacitive load. 

Low inductance capacitors and interconnects are 
recommended for best high frequency electrical per- 
formance. Inductance can be reduced by shortening 
the board traces between the processor and de- 
coupling capacitors as much as possible. Capacitors 
specifically designed for PGA packages are also 
commercially available and offer the lowest possible 
inductance. 



Connection Recommendations 

For reliable operation, always connect unused in- 
puts to an appropriate signal level. In particular, if 
one or more interrupt lines are not used, they should 
be pulled up. No inputs should ever be left floating. 

All open-drain outputs require a pullup device. While 
in some cases a simple pullup resistor will be 
adequate, we recommend a network of pullup and 
pulldown resistors biased to a valid VIH (≥3.4V) and 
terminated in the characteristic impedance of the 
circuit board. Figure 6 shows our recommendations for 
the resistor values for both a low and high current 
drive network, which assumes that the circuit board 
has a characteristic impedance of 100Ω. The 
advantage of terminating the output signals in this fashion 
is that it limits signal swing and reduces AC power 
consumption. 



Characteristic Curves 

Figure 7 shows the typical supply current 
requirements over the operating temperature range of the 
processor at supply voltage (Vcc) of 5V. Figure 8 
shows the typical power supply current (Icc) 
required by the 80960KB at various operating 
frequencies when measured at three input voltage (Vcc) 
levels. 

For a given output current (IOL), the curve in Figure 9 
shows the worst case output low voltage (VOL). 

Figure 10 shows the typical capacitive derating 
curve for the 80960KB measured from 1.5V on the 
system clock (CLK) to 1.5V on the falling edge and 
1.5V on the rising edge of the L-Bus address/data 
(LAD) signals. 



Test Load Circuit 

Figure 13 illustrates the load circuit used to test the 
80960KB's tristate pins, and Figure 14 shows the 
load circuit used to test the open drain outputs. The 
open drain test uses an active load circuit in the form 
of a matched diode bridge. Since the open-drain 
outputs sink current, only the IOL legs of the bridge are 
necessary and the IOH legs are not used. When the 
80960KB driver under test is turned off, the output 
pin is pulled up to VREF (i.e., VOH). Diode D1 is 
turned off and the IOL current source flows through 
diode D2. 

When the 80960KB open-drain driver under test is 
on, diode D1 is also on, and the voltage on the pin 
being tested drops to VOL. Diode D2 turns off and 
IOL flows through diode D1. 

Figure 6. Connection Recommendations for Low and High Current Drive Networks 

Figure 7. Typical Supply Current (Icc) 

Figure 8. Typical Current vs Frequency 

(Temp = +85°C, Vcc = 4.5V) 

Figure 9. Worst Case Voltage vs Output 
Current on Open-Drain Pins 

Figure 10. Capacitive Derating Curve 




ABSOLUTE MAXIMUM RATINGS*

Operating Temperature .......... 0°C to +85°C Case
Storage Temperature ............ -65°C to +150°C
Voltage on Any Pin ............. -0.5V to VCC + 0.5V
Power Dissipation .............. 2.5W (25 MHz)



NOTICE: This is a production data sheet. The specifi- 
cations are subject to change without notice. 



* WARNING: Stressing the device beyond the "Absolute 
Maximum Ratings" may cause permanent damage. 
These are stress ratings only. Operation beyond the 
"Operating Conditions" is not recommended and ex- 
tended exposure beyond the "Operating Conditions" 
may affect device reliability. 



DC CHARACTERISTICS 



PGA:

80960KB (16 MHz): TCASE = 0°C to +85°C, VCC = 5V ±10%
80960KB (20 and 25 MHz): TCASE = 0°C to +85°C, VCC = 5V ±5%

PQFP:

80960KA (10 and 16 MHz): TCASE = 0°C to +100°C, VCC = 5V ±10%
80960KA (20 MHz): TCASE = 0°C to +100°C, VCC = 5V ±5%



Symbol  Parameter                   Min       Max        Units  Test Conditions
VIL     Input Low Voltage           -0.3      +0.8       V
VIH     Input High Voltage          2.0       VCC + 0.3  V
VCL     CLK2 Input Low Voltage      -0.3      +0.8       V
VCH     CLK2 Input High Voltage     0.55 VCC  VCC + 0.3  V
VOL     Output Low Voltage                    0.45       V      (1, 5)
VOH     Output High Voltage         2.4                  V      (2, 4)
ICC     Power Supply Current:
          10 MHz                              300        mA
          16 MHz                              375        mA
          20 MHz                              420        mA
          25 MHz                              480        mA
ILI     Input Leakage Current                 ±15        µA     0 ≤ VIN ≤ VCC
ILO     Output Leakage Current                ±15        µA     0.45 ≤ VO ≤ VCC
CIN     Input Capacitance                     10         pF     fC = 1 MHz (3)
CO      I/O or Output Capacitance             12         pF     fC = 1 MHz (3)
CCLK    Clock Capacitance                     10         pF     fC = 1 MHz (3)

NOTES:

1. For tri-state outputs, this parameter is measured at:
     Address/Data ............ 4.0 mA
     Controls ................ 5.0 mA

2. This parameter is measured at:
     Address/Data ............ -1.0 mA
     Controls ................ -0.9 mA
     ALE ..................... -5.0 mA

3. Input, output, and clock capacitance are not tested.

4. Not measured on open-drain outputs.

5. For open-drain outputs ..... 25 mA
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AC SPECIFICATIONS 

This section describes the AC specifications for the
80960KB pins. All input and output timings are specified
relative to the 1.5V level of the rising edge of CLK2. For
output timings, the specifications refer to the time it
takes the signal to reach 1.5V.

For input timings, the specifications refer to the time
at which the signal reaches (for input setup) or
leaves (for hold time) the TTL levels of LOW (0.8V)
or HIGH (2.0V). All AC testing should be done with
input voltages of 0.4V and 2.4V, except for the
clock (CLK2), which should be tested with input voltages
of 0.45 VCC and 0.55 VCC.



[Figure 11 diagram. Signals shown relative to the CLK2 edge:
OUTPUTS: LAD31-LAD0, ADS, W/R, DEN, BE3-BE0, HLDA/HOLDR, CACHE, LOCK, INTA; ALE; DT/R
INPUTS: LAD31-LAD0, BADAC, IAC/INT0, INT1, INT2/INTR, INT3, HOLD/HLDAR, LOCK, READY]

Figure 11. Drive Levels and Timing Relationships for 80960KB Signals
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Figure 12. Timing Relationship of L-Bus Signals 
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AC Specification Tables 

80960KB AC Characteristics (10 MHz, PQFP Only) 


Symbol  Parameter                         Min    Max   Units  Test Conditions
T1      Processor Clock Period (CLK2)     50     125   ns     VIN = 1.5V
T2      Processor Clock Low Time (CLK2)   12           ns     VIL = 10% Point = 1.2V
T3      Processor Clock High Time (CLK2)  12           ns     VIH = 90% Point = 0.1V + 0.5 VCC
T4      Processor Clock Fall Time (CLK2)         10    ns     VIN = 90% Point to 10% Point
T5      Processor Clock Rise Time (CLK2)         10    ns     VIN = 10% Point to 90% Point
T6      Output Valid Delay                2      25    ns     CL = 100 pF (LAD)
                                                              CL = 75 pF (Controls)(2)
T6H     HOLDA Output Valid Delay          4      31    ns     CL = 75 pF
T7      ALE Width                         25           ns     CL = 75 pF
T8      ALE Output Valid Delay                   20    ns     CL = 75 pF (2)
T9      Output Float Delay                2      20    ns     CL = 100 pF (LAD)
                                                              CL = 75 pF (Controls)
T9H     HOLDA Output Float Delay          4      20    ns     CL = 75 pF
T10     Input Setup 1                     3            ns
T11     Input Hold                        5            ns
T11H    HOLD Input Hold                   4            ns
T12     Input Setup 2                     8            ns
T13     Setup to ALE Inactive             10           ns     CL = 100 pF (LAD)
                                                              CL = 75 pF (Controls)
T14     Hold after ALE Inactive           8            ns     CL = 100 pF (LAD)
                                                              CL = 75 pF (Controls)
T15     Reset Hold                        3            ns
T16     Reset Setup                       5            ns
T17     Reset Width                       1640         ns     41 CLK2 Periods Minimum

NOTES:

1. IAC/INT0, INT1, INT2/INTR, INT3 can be asynchronous.

2. A float condition occurs when the maximum output current becomes less than ILO. Float delay is not tested, but should be
no longer than the valid delay.

3. Clock rise and fall times are not tested.
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80960KB AC Characteristics (16 MHz) 


Symbol  Parameter                         Min    Max   Units  Test Conditions
T1      Processor Clock Period (CLK2)     31.25  125   ns     VIN = 1.5V
T2      Processor Clock Low Time (CLK2)   8            ns     VIL = 10% Point = 1.2V
T3      Processor Clock High Time (CLK2)  8            ns     VIH = 90% Point = 0.1V + 0.5 VCC
T4      Processor Clock Fall Time (CLK2)         10    ns     VIN = 90% Point to 10% Point
T5      Processor Clock Rise Time (CLK2)         10    ns     VIN = 10% Point to 90% Point
T6      Output Valid Delay                2      25    ns     CL = 100 pF (LAD)
                                                              CL = 75 pF (Controls)
T6H     HOLDA Output Valid Delay          4      31    ns     CL = 75 pF
T7      ALE Width                         15           ns     CL = 75 pF
T8      ALE Output Valid Delay                   20    ns     CL = 75 pF (2)
T9      Output Float Delay                2      20    ns     CL = 100 pF (LAD)
                                                              CL = 75 pF (Controls)(2)
T9H     HOLDA Output Float Delay          4      20    ns     CL = 75 pF
T10     Input Setup 1                     3            ns
T11     Input Hold                        5            ns
T11H    HOLD Input Hold                   4            ns
T12     Input Setup 2                     8            ns
T13     Setup to ALE Inactive             10           ns     CL = 100 pF (LAD)
                                                              CL = 75 pF (Controls)
T14     Hold after ALE Inactive           8            ns     CL = 100 pF (LAD)
                                                              CL = 75 pF (Controls)
T15     Reset Hold                        3            ns
T16     Reset Setup                       5            ns
T17     Reset Width                       1281         ns     41 CLK2 Periods Minimum

NOTES:

1. IAC/INT0, INT1, INT2/INTR, INT3 can be asynchronous.

2. A float condition occurs when the maximum output current becomes less than ILO. Float delay is not tested, but should be
no longer than the valid delay.

3. Clock rise and fall times are not tested.
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80960KB AC Characteristics (20 MHz) 


Symbol  Parameter                         Min    Max   Units  Test Conditions
T1      Processor Clock Period (CLK2)     25     125   ns     VIN = 1.5V
T2      Processor Clock Low Time (CLK2)   6            ns     VIL = 10% Point = 1.2V
T3      Processor Clock High Time (CLK2)  6            ns     VIH = 90% Point = 0.1V + 0.5 VCC
T4      Processor Clock Fall Time (CLK2)         10    ns     VIN = 90% Point to 10% Point
T5      Processor Clock Rise Time (CLK2)         10    ns     VIN = 10% Point to 90% Point
T6      Output Valid Delay                2      20    ns     CL = 60 pF (LAD)
                                                              CL = 50 pF (Controls)
T6H     HOLDA Output Valid Delay          4      26    ns     CL = 50 pF
T7      ALE Width                         12           ns     CL = 50 pF
T8      ALE Output Valid Delay                   20    ns     CL = 50 pF (2)
T9      Output Float Delay                2      20    ns     CL = 60 pF (LAD)
                                                              CL = 50 pF (Controls)(2)
T9H     HOLDA Output Float Delay          4      20    ns     CL = 50 pF
T10     Input Setup 1                     3            ns
T11     Input Hold                        5            ns
T11H    HOLD Input Hold                   4            ns
T12     Input Setup 2                     7            ns
T13     Setup to ALE Inactive             10           ns     CL = 60 pF (LAD)
                                                              CL = 50 pF (Controls)
T14     Hold after ALE Inactive           8            ns     CL = 60 pF (LAD)
                                                              CL = 50 pF (Controls)
T15     Reset Hold                        3            ns
T16     Reset Setup                       5            ns
T17     Reset Width                       1025         ns     41 CLK2 Periods Minimum

NOTES:

1. IAC/INT0, INT1, INT2/INTR, INT3 can be asynchronous.

2. A float condition occurs when the maximum output current becomes less than ILO. Float delay is not tested, but should be
no longer than the valid delay.

3. Clock rise and fall times are not tested.
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Figure 13. Test Load Circuit for 
Tri-State Output Pins 



Figure 14. Test Load Circuit for Open-Drain Output Pins 
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80960KB AC Characteristics (25 MHz, PGA Only) 


Symbol  Parameter                         Min    Max   Units  Test Conditions
T1      Processor Clock Period (CLK2)     20     125   ns     VIN = 1.5V
T2      Processor Clock Low Time (CLK2)   5            ns     VIL = 10% Point = 1.2V
T3      Processor Clock High Time         5            ns     VIH = 90% Point = 0.1V + 0.5 VCC
T4      Processor Clock Fall Time (CLK2)         10    ns     VIN = 90% Point to 10% Point
T5      Processor Clock Rise Time (CLK2)         10    ns     VIN = 10% Point to 90% Point
T6      Output Valid Delay                2      18    ns     CL = 60 pF (LAD)
                                                              CL = 50 pF (Controls)
T6H     HOLDA Output Valid Delay          4      24    ns     CL = 50 pF
T7      ALE Width                         12           ns     CL = 50 pF
T8      ALE Output Valid Delay                   20    ns     CL = 50 pF (2)
T9      Output Float Delay                2      18    ns     CL = 60 pF (LAD)
                                                              CL = 50 pF (Controls)
T9H     HOLDA Output Float Delay          4      20    ns     CL = 50 pF
T10     Input Setup 1                     3            ns
T11     Input Hold                        5            ns
T11H    HOLD Input Hold                   4            ns
T12     Input Setup 2                     7            ns
T13     Setup to ALE Inactive             8            ns     CL = 60 pF (LAD)
                                                              CL = 50 pF (Controls)
T14     Hold after ALE Inactive           8            ns     CL = 60 pF (LAD)
                                                              CL = 50 pF (Controls)
T15     Reset Hold                        3            ns
T16     Reset Setup                       5            ns
T17     Reset Width                       820          ns     41 CLK2 Periods Minimum

NOTES:

1. IAC/INT0, INT1, INT2/INTR, INT3 can be asynchronous.

2. A float condition occurs when the maximum output current becomes less than ILO. Float delay is not tested, but should be
no longer than the valid delay.

3. Clock rise and fall times are not tested.
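
The Reset Width (T17) entries in the 16, 20, and 25 MHz tables can be reproduced from the "41 CLK2 Periods Minimum" test condition. The following is a minimal sketch, assuming that CLK2 runs at twice the rated processor speed (an assumption that agrees with the T1 minimums listed in those tables); the function names are hypothetical and are not part of any Intel tool.

```python
def clk2_period_ns(speed_mhz):
    """Minimum CLK2 period in ns, assuming CLK2 = 2 x rated speed
    (e.g. 25 MHz part -> 50 MHz CLK2 -> 20 ns, matching T1 min)."""
    return 1000.0 / (2.0 * speed_mhz)


def min_reset_width_ns(speed_mhz, periods=41):
    """Minimum RESET width: 41 CLK2 periods at the minimum T1."""
    return periods * clk2_period_ns(speed_mhz)


# Compare against the T17 rows of the 16, 20, and 25 MHz tables.
for mhz in (16, 20, 25):
    print(f"{mhz} MHz: T1 = {clk2_period_ns(mhz)} ns, "
          f"T17 >= {min_reset_width_ns(mhz)} ns")
```

If the clock is run slower than the rated maximum, the same 41-period rule would apply with the longer period substituted.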
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Figure 15. Processor Clock Pulse (CLK2) 
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Figure 16. RESET Signal Timing 
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Figure 17. Hold Timing 



Design Considerations 



Input hold times can be disregarded by the designer 
whenever the input is removed because a subse- 
quent output from the processor is deasserted (e.g., 
DEN becomes deasserted). 

Whenever the processor generates an output that
indicates a transition into a subsequent state, any
outputs that are specified to be tri-stated in this new
state are guaranteed to be tri-stated. For example, in
the Td cycle following a Ta cycle for a read, the minimum
output delay of DEN is 2 ns, but the maximum
float time of LAD is 20 ns. When DEN is asserted,
however, the LAD outputs are guaranteed to have
been tri-stated.



Designing for the ICE-960KB 

The 80960KB In-Circuit Emulator assists in debug- 
ging both 80960KA and 80960KB hardware and 
software designs. The product consists of a probe 
module, cable, and control unit. Because of the high 
operating frequency of 80960KB systems, the probe 
module connects directly to the 80960KB socket. 



When designing an 80960KB hardware system that 
uses the ICE-960KB to debug the system, several 
electrical and mechanical characteristics should be 
considered. These considerations include capacitive 
loading, drive requirement, power requirement and 
physical layout. 

The ICE-960KB probe module increases the load 
capacitance of each line by up to 25 pF. It also adds 
one standard Schottky TTL load on the CLK2 line, 
up to one advanced low-power Schottky TTL load 
for each control signal line, and one advanced low- 
power Schottky TTL load for each address/data and 
byte enable line. These loads originate from the 
probe module and are driven by the 80960KB proc- 
essor. 

To achieve high noise immunity, the ICE-960KB
probe is powered by the user's system. The high-speed
probe circuitry draws up to 1.1A plus the maximum
current (ICC) of the 80960KB processor.
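
As a rough illustration of this power requirement, the sketch below adds the probe's 1.1A allowance to the maximum supply current listed in the DC Characteristics table for each speed. The names are hypothetical, and the result is only a sizing estimate for the user's 5V supply, not a specification.

```python
# Maximum processor supply current in amperes, from the DC
# Characteristics table of this data sheet, keyed by speed in MHz.
ICC_MAX_A = {10: 0.300, 16: 0.375, 20: 0.420, 25: 0.480}

# Worst-case draw of the ICE-960KB probe circuitry (from the text).
PROBE_MAX_A = 1.1


def ice_supply_current_a(speed_mhz):
    """Worst-case current (A) the target system must source for the
    probe plus the processor (hypothetical helper)."""
    return PROBE_MAX_A + ICC_MAX_A[speed_mhz]


for mhz in sorted(ICC_MAX_A):
    print(f"{mhz} MHz: {ice_supply_current_a(mhz):.3f} A")
```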

The mechanical considerations are shown in Figure
18, which illustrates the lateral clearance requirements
for the ICE-960KB probe as viewed from
above the socket of the 80960KB processor.
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Figure 18. ICE-960KB Lateral Clearance Requirements 



MECHANICAL DATA 



Package Dimensions and Mounting 

The 80960KB is available in two different packages: 
a 132-lead ceramic pin-grid array (PGA) and a 132- 
lead plastic quad flat pack (PQFP). Pins in the ce- 
ramic package are arranged 0.100 inch (2.54 mm) 
center-to-center, in a 14 by 14 matrix, three rows
around. (See Figure 19.) The plastic package uses 
fine-pitch gull wing leads arranged in a single row 
along the perimeter of the package with 0.025 inch 
(0.64 mm) spacing. (See Figure 20.) Dimensions are 
given in Figure 21 and Table 7. 

There are a wide variety of sockets available for the 
ceramic PGA package including low-insertion or 
zero-insertion force mountings, and a choice of ter- 
minals such as soldertail, surface mount, or wire 
wrap. Several applicable sockets are shown in Fig- 
ure 22. 

The PQFP is normally surface mounted to take best
advantage of the plastic package's small footprint
and low cost. In some applications, however, designers
may prefer to use a socket, either to improve
heat dissipation or reduce repair costs. Figures 23a
and 23b show two of the many sockets available.



Pin Assignment 

The PGA and PQFP have different pin assignments. 
Figure 24 shows the view from the bottom of the 
PGA (pins facing up) and Figure 25 shows a view 
from the top of the PGA (pins facing down). Figures 
20 and 32 show the top view of the PQFP; notice 
that the pins are numbered in order from 1 to 132 
around the package's perimeter. Tables 5 and 6 list 
the function of each pin in the PGA, and Tables 8 
and 9 list the function of each pin in the PQFP. 

VCC and GND connections must be made to multiple
VCC and GND pins. Each VCC and GND pin must
be connected to the appropriate voltage or ground
and externally strapped close to the package. We
recommend that you include separate power and
ground planes in your circuit board for power distribution.

NOTE:

Pins identified as N.C., "No Connect," should never
be connected.




Package Thermal Specification 

The 80960KB is specified for operation when case 
temperature is within the range 0°C to + 85°C (PGA) 
or +100°C (PQFP). The case temperature should 
be measured at the top center of the package as 
shown in Figure 26. 

The ambient temperature can be calculated from θJC
and θJA by using the following equations:

TJ = TC + P*θJC

TA = TJ - P*θJA

TC = TA + P*[θJA - θJC]

Values for θJA and θJC are given in Table 10 for the
PGA package and in Table 11 for the PQFP for various
airflows. Note that the θJA for the PGA package
can be reduced by adding a heatsink, while a heatsink
is not generally used with the plastic package
since it is intended to be surface mounted. The maximum
allowable ambient temperature (TA) permitted
without exceeding TC is shown by the charts in Figures
27 through 30 for 10 MHz, 16 MHz, 20 MHz,
and 25 MHz respectively.

The curves assume the maximum permitted supply
current (ICC) at each speed, VCC of 5.0V, and a
TCASE of +85°C (PGA) or +100°C (PQFP).
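
The third equation can be rearranged to estimate the maximum ambient temperature for a given case limit. The sketch below is illustrative only: it assumes the dissipated power is simply ICC times VCC (the conditions the curves are stated to use) and takes the case-to-ambient resistance (θJA - θJC) from Table 10. The function name is hypothetical.

```python
def max_ambient_c(t_case_max_c, icc_a, vcc_v, theta_ca_c_per_w):
    """Maximum ambient temperature (deg C) that keeps the case at or
    below t_case_max_c, from TC = TA + P*(thetaJA - thetaJC)."""
    power_w = icc_a * vcc_v              # assumed: P = ICC * VCC
    return t_case_max_c - power_w * theta_ca_c_per_w


# 20 MHz PGA, no heatsink, no airflow: ICC = 420 mA, VCC = 5.0V,
# TCASE = 85 deg C, case-to-ambient resistance 19 deg C/W (Table 10).
print(round(max_ambient_c(85.0, 0.420, 5.0, 19.0), 1))
```

With these inputs the estimate comes out at roughly 45°C; the charts in the figures remain the reference for design work.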

If you will be using the 80960KB in a harsh environment
where the ambient temperature may exceed
the limits for the normal commercial part, you should
consider using an extended temperature part. These
parts are designated by the prefix "TA" and are available
at 16, 20 and 25 MHz in the ceramic PGA package.
The extended operating temperature range is
-40°C to +125°C case. Figure 31 shows the maximum
allowable ambient temperature for the 20 MHz
extended temperature TA80960KB at various airflows.
The curve assumes an ICC of 420 mA, VCC of
5.0V, and a TCASE of +125°C.



WAVEFORMS 

Figures 33 through 38 show the waveforms for vari- 
ous transactions on the 80960KB's local bus. 



SUPPORT COMPONENTS 



85C960 Burst Bus Controller 

The Intel 85C960 performs burst logic, ready gener- 
ation, and address decode for the 80960KA and 
80960KB. The burst logic supports both standard 
and burst mode memories and peripherals. The 
ready generation and timing control supports up to 15
wait states across eight address ranges for read/ 
write and burst accesses. The address decoder de- 
codes eight address inputs into four external and 
four internal chip selects. The wait state and chip 
select values may be programmed by the user; the 
timing control and burst logic are fixed. 

The 85C960 operates with the 80960KA and 
80960KB at all frequencies and consumes only 50 
mA at 25 MHz. The 85C960 is available in 28-pin,
300-mil ceramic and plastic DIP packages or in a 28-pin
PLCC package for surface mount. In the ceramic
DIP package the part is UV-erasable, which makes it 
easy to revise designs. Order the 85C960 data sheet 
(No. 290192) for full details. 



27960KX Burst Mode EPROM

Intel 27960KX one-megabit EPROM is designed 
specifically to support the 80960KA and 80960KB. It 
uses a burst interface to offer near zero wait-state 
performance without the high cost of alternative 
memory technologies. The 27960KX removes the 
need for "dumping" code and data stored in slow 
EPROMs or ROMs into expensive high-speed 
"shadow" RAM. 

Internally, the 27960KX is organized in blocks of four 
bytes that are accessed sequentially. The address 
of the four-byte block is latched and incremented 
internally. After a set number of wait-states (1 or 2), 
data is output one word at a time each subsequent 
clock cycle. High-performance outputs provide zero 
wait-state data-to-data burst accesses. Extra power 
and ground pins dedicated to the output reduce the 
effect of fast output switching on the device. The 
27960KX offers 1-0-0-0 performance at 20 MHz and 
2-0-0-0 performance at 25 MHz. Full details can be 
found in the 27960KX data sheet (No. 290337). 
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Figure 19. A 132-Lead Pin-Grid Array (PGA) Used to Package the 80960KB 
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Figure 20. The 132-Lead Plastic Quad Flat Pack (PQFP) used to Package the 80960KB 
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Figure 21a. Principal Dimensions of the 132-Lead PQFP 
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Figure 21b. Details of the Molding of the 132-Lead PQFP 
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Figure 21c. Terminal Details for the 132-Lead PQFP 
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Figure 21 d. Board Footprint Area for the 132-Lead PQFP 
Table 7. Package Dimension: 80960KB PQFP

                                        Inches              MM
Symbol   Description               Min      Max       Min      Max
N        Leadcount                   132 Leads          132 Leads
A        Package Height            0.160    0.170     4.060    4.320
A1       Standoff                  0.020    0.030     0.510    0.760
D, E     Terminal Dimension        1.075    1.085     27.310   27.560
D1, E1   Package Body              0.947    0.953     24.050   24.210
D2, E2   Bumper Distance
           Without Flash           1.097    1.103     27.860   28.010
           With Flash              1.097    1.110     27.860   28.190
D3, E3   Lead Dimension              0.800 REF          20.32 REF
D4, E4   Foot Radius Location      1.023    1.037     25.890   26.330
L1       Foot Length               0.020    0.030     0.510    0.760
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• Low insertion force (LIF) soldertail 
55274-1

• Amp tests indicate 50% reduction in
insertion force compared to
machined sockets

Other socket options

• Zero insertion force (ZIF) soldertail
55583-1

• Zero insertion force (ZIF) Burn-in
version 55573-2

Amp Incorporated
(Harrisburg, PA 17105 U.S.A.
Phone 717-564-0100)

Cam handle locks in low profile position when 80960KB is installed
(handle UP for open and DOWN for closed positions).

Courtesy Amp Incorporated

Peel-A-Way* Mylar and Kapton
Socket Terminal Carriers

• Low insertion force surface
mount CS132-37TG

• Low insertion force soldertail
CS132-01TG

• Low insertion force wire-wrap
CS132-02TG (two-level)
CS132-03TG (three-level)

• Low insertion force press-fit
CS132-05TG

Advanced Interconnections
(5 Division Street,
Warwick, RI 02818 U.S.A.
Phone 401-885-0485)

Peel-A-Way Carrier No. 132:
Kapton Carrier is KS132
Mylar Carrier is MS132

Molded Plastic Body KS132
is shown below:

Footprint No. 132: 14 x 14 x 3 rows
Terminal styles: solder tail -01, wire wrap -02/-03, low profile -04,
solder tail -33, press fit -05, surface mounting -37

Courtesy Advanced Interconnections
(Peel-A-Way Terminal Carriers
U.S. Patent No. 4442938)

* Peel-A-Way is a trademark of Advanced Interconnections.



Figure 22. Several Socket Options for Mounting the 80960KB 
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Figure 23a. AMP Micropitch Socket for the 132-Lead Plastic 
Quad Flat Pack, 0.025" Lead Spacing, Gull Wing Leads 
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Part Number: 

2-0132-07244-000-018007 






Figure 23b. 3M Company PQFP Socket and Lid 



3-112 



iny. 



80960KB 





[Figure 24 diagram: 14 by 14 pin grid, three rows around, with rows lettered A through P (I and O omitted) and columns numbered 1 through 14; each pin position is labeled with its signal name.]



Figure 24. 80960KB PGA Pinout— View from Bottom (Pins Facing Up) 
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Figure 25. 80960KB PGA Pinout— View from Top (Pins Facing Down) 
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Table 5. 80960KB PGA Pinout—ln Pin Order 


Pin   Signal    Pin   Signal    Pin   Signal    Pin   Signal
A1    VCC       C6    LAD20     H1    W/R       M10   VSS
A2    VSS       C7    LAD13     H2    BE0       M11   VCC
A3    LAD19     C8    LAD8      H3    LOCK      M12   N.C.
A4    LAD17     C9
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M13 


N.C. 
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LAD 16 


C10 


Vcc 


H13 
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Table 6. 80960KB PGA Pinout— In Signal Order 


Signal   Pin   Signal   Pin   Signal   Pin   Signal   Pin
ADS      D2    LAD15    B6    N.C.     J14
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ALE 
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LAD 16 
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MEASURE PGA CASE TEMPERATURE 

AT CENTER OF TOP SURFACE

MEASURE PQFP TEMPERATURE AT CENTER OF TOP SURFACE
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Figure 26. Measuring 80960KB PGA and PQFP Case Temperature 
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Figure 27. 10 MHz 80960 K-Series Maximum Allowable Ambient Temperature 
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Figure 28. 16 MHz 80960 K-Series Maximum Allowable Ambient Temperature 
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■ PQFP DPGA with no ♦PGA with omni- OPGA with uni- 

heatsink directional heatsink directional heatsink 
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Figure 29. 20 MHz 80960 K-Series Maximum Allowable Ambient Temperature 
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Figure 30. Maximum Allowable Ambient Temperature for 
the 80960KB at 25 MHz (available in PGA only) 
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Figure 31. Maximum Allowable Ambient Temperature for the Extended 
Temperature TA-80960KB at 20 MHz (available in PGA only) 
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116 




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Figure 32. 80960KB PQFP Pinout— View from Top 
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Table 8. 80960KB Plastic Package Pinout— In Pin Order 


Pin  Signal        Pin  Signal   Pin  Signal       Pin  Signal
1    HLDA/HOLDR    34   N.C.     67   VSS          100  LAD0
2    ALE           35   VCC      68   VSS          101  LAD1
3    LAD26         36   VCC      69   N.C.         102  LAD2
4    LAD27         37   N.C.     70   VCC          103  VSS
5    LAD28         38   N.C.     71   VCC          104  LAD3
6    LAD29         39   N.C.     72   N.C.         105  LAD4
7    LAD30         40   N.C.     73   VSS          106  LAD5
8    LAD31         41   VCC      74   VCC          107  LAD6
9    VSS           42   VSS      75   N.C.         108  LAD7
10   CACHE         43   N.C.     76   N.C.         109  LAD8
11   W/R           44   N.C.     77   N.C.         110  LAD9
12   READY         45   N.C.     78   N.C.         111  LAD10
13   DT/R          46   N.C.     79   VSS          112  LAD11
14   BE0           47   N.C.     80   VSS          113  LAD12
15   BE1           48   N.C.     81   N.C.         114  VSS
16   BE2           49   N.C.     82   VCC          115  LAD13
17   BE3           50   N.C.     83   VCC          116  LAD14
18   FAILURE       51   N.C.     84   VSS          117  LAD15
19   VSS           52   VSS      85   IAC/INT0     118  LAD16
20   LOCK          53   VSS      86   INT1         119  LAD17
21   DEN           54   N.C.     87   INT2/INTR    120  LAD18
22   VSS           55   VCC      88   INT3/INTA    121  LAD19
23   VSS           56   VCC      89   N.C.         122  LAD20
24   N.C.          57   VSS      90   VSS          123  LAD21
25   N.C.          58   N.C.     91   CLK2         124  LAD22
26   VSS           59   N.C.     92   VCC          125  VSS
27   VSS           60   N.C.     93   RESET        126  LAD23
28   N.C.          61   N.C.     94   N.C.         127  LAD24
29   VCC           62   N.C.     95   N.C.         128  LAD25
30   VCC           63   N.C.     96   N.C.         129  BADAC
31   N.C.          64   N.C.     97   N.C.         130  HOLD/HLDAR
32   VSS           65   N.C.     98   N.C.         131  N.C.
33   VSS           66   N.C.     99   VSS          132  ADS
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Table 9. 80960KB Plastic Package Pinout— In Signal Order 


Signal        Pin   Signal   Pin   Signal   Pin   Signal   Pin
ADS           132   LAD22    124   N.C.     49    VCC      41
ALE           2     LAD23    126   N.C.     50    VCC      55
BADAC         129   LAD24    127   N.C.     51    VCC      56
BE0           14    LAD25    128   N.C.     54    VCC      70
BE1           15    LAD26    3     N.C.     58    VCC      71
BE2           16    LAD27    4     N.C.     59    VCC      74
BE3           17    LAD28    5     N.C.     60    VCC      82
CACHE         10    LAD29    6     N.C.     61    VCC      83
CLK2          91    LAD3     104   N.C.     62    VCC      92
DEN           21    LAD30    7     N.C.     63    VSS      9
DT/R          13    LAD31    8     N.C.     64    VSS      19
FAILURE       18    LAD4     105   N.C.     65    VSS      22
HLDA/HOLDR    1     LAD5     106   N.C.     66    VSS      23
HOLD/HLDAR    130   LAD6     107   N.C.     69    VSS      26
IAC/INT0      85    LAD7     108   N.C.     72    VSS      27
INT1          86    LAD8     109   N.C.     75    VSS      32
INT2/INTR     87    LAD9     110   N.C.     76    VSS      33
INT3/INTA     88    LOCK     20    N.C.     77    VSS      42
LAD0          100   N.C.     24    N.C.     78    VSS      52
LAD1          101   N.C.     25    N.C.     81    VSS      53
LAD10         111   N.C.     28    N.C.     89    VSS      57
LAD11         112   N.C.     31    N.C.     94    VSS      67
LAD12         113   N.C.     34    N.C.     95    VSS      68
LAD13         115   N.C.     37    N.C.     96    VSS      73
LAD14         116   N.C.     38    N.C.     97    VSS      79
LAD15         117   N.C.     39    N.C.     98    VSS      80
LAD16         118   N.C.     40    N.C.     131   VSS      84
LAD17         119   N.C.     43    READY    12    VSS      90
LAD18         120   N.C.     44    RESET    93    VSS      99
LAD19         121   N.C.     45    VCC      29    VSS      103
LAD2          102   N.C.     46    VCC      30    VSS      114
LAD20         122   N.C.     47    VCC      35    VSS      125
LAD21         123   N.C.     48    VCC      36    W/R      11
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Table 10. 80960KB PGA Package Thermal Characteristics 



Thermal Resistance— °C/Watt 


Parameter 


Airflow— ft./min (m/sec) 



(0) 


50 
(0.25) 


100 
(0.50) 


200 
(1-01) 


400 
(2.03) 


600 
(3.04) 


800 
(4.06) 


Junction-to-Case 

(Case Measured 

as shown in Figure 26) 


2 


2 


2 


2 


2 


2 


2 


6 Case-to-Ambient 
(No Heatsink) 


19 


18 


17 


15 


12 


10 


9 


Case-to-Ambient 
(with Omnidirectional 
Heatsink) 


16 


15 


14 


12 


9 


7 


6 


Case-to-Ambient 
(with Unidirectional) 
Heatsink) 


15 


14 


13 


11 


8 


6 


5 



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NOTES: 

1. This table applies to 80960KB PGA plugged into socket or
   soldered directly into board.

2. θJA = θJC + θCA.

3. θJ-CAP = 4°C/W (approx.)
   θJ-PIN = 4°C/W (inner pins) (approx.)
   θJ-PIN = 8°C/W (outer pins) (approx.)
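The relation in Note 2 can be applied to the table values as in the following minimal sketch. The resistance values are taken from Table 10 (PGA, no heatsink); the function names, the ambient temperature, and the power figure used in the usage note are illustrative assumptions and are not data sheet values. The junction temperature formula is the standard steady-state relation, not something stated in this section.

```python
# Junction-to-case resistance for the PGA package, deg C/W (Table 10).
THETA_JC = 2

# Case-to-ambient resistance with no heatsink, deg C/W, keyed by
# airflow in ft/min (Table 10).
THETA_CA_NO_HEATSINK = {0: 19, 50: 18, 100: 17, 200: 15,
                        400: 12, 600: 10, 800: 9}


def theta_ja(airflow_ft_min):
    """Junction-to-ambient resistance: theta_JA = theta_JC + theta_CA."""
    return THETA_JC + THETA_CA_NO_HEATSINK[airflow_ft_min]


def junction_temp(ambient_c, power_w, airflow_ft_min):
    """Steady-state estimate T_J = T_A + P * theta_JA.

    ambient_c and power_w are caller-supplied assumptions.
    """
    return ambient_c + power_w * theta_ja(airflow_ft_min)
```

For instance, with a hypothetical 2 W dissipation at 25 °C ambient and 200 ft./min airflow, `junction_temp(25, 2.0, 200)` evaluates 25 + 2 × (2 + 15).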



Table 11. 80960KB PQFP Package Thermal Characteristics 



PQFP Thermal Resistance (°C/Watt)

                                  Airflow, ft./min (m/sec)

Parameter                     0      50     100    200    400    600    800
                             (0)   (0.25) (0.50) (1.01) (2.03) (3.04) (4.06)

Junction-to-Case              9      9      9      9      9      9      9
(Case Measured as
shown in Figure 26)

Case-to-Ambient              22     19     18     16     11      9      8
(No Heatsink)



NOTES: 

1. This table applies to 80960KB PQFP soldered directly into board.

2. θJA = θJC + θCA.

3. θJL = 18°C/Watt
   θJB = 18°C/Watt
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Figure 33. Read Transaction 
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Figure 34. Write Transaction with One Wait State 
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Figure 35. Burst Read Transaction 
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Figure 36. Burst Write Transaction with One Wait State 
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Figure 37. Interrupt Acknowledge Transaction 
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Figure 38. Bus Exchange Transaction (PBM = Primary Bus Master, SBM = Secondary Bus Master) 
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80960CA PRODUCT OVERVIEW 



1.0 PURPOSE 

The 80960CA Product Overview is a summary of the 
features and operation of Intel's 80960CA Embedded 
Processor. The Product Overview is intended for those 
who are not familiar with the 80960 architecture or the 
80960CA, a product built around this architecture. The 
80960CA Product Overview provides a programmer or
a system designer with a quick, global view of software 
and hardware design considerations for the 80960CA. 
For further information, refer to the following refer- 
ence documents: 

— The 80960CA User's Manual contains detailed tech- 
nical information and examples for designing em- 
bedded systems using the 80960CA. 

— The 80960CA Data Sheet provides electrical specifi- 
cations for the device, such as the DC and AC pa- 
rameters, operating conditions, and packaging spec- 
ifications. 



2.0 80960CA 32-BIT EMBEDDED 
PROCESSOR 

The 80960CA (Figure 2-1) is optimized for embedded 
processing applications. This product features the high- 
performance C-Series core plus built-in system periph- 
erals, effectively integrating a high-speed CPU and sys- 
tem components onto a single silicon die. The 80960CA 
is a member of Intel's 80960 embedded processor fami- 
ly. Each member of the 80960 family is based on a 
common architectural definition referred to as the core 
architecture. 

An 80960 family member, such as the 80960CA, is 
made up of an implementation of the core architecture 
plus application-specific extensions. These extensions 
may consist of integrated peripherals, instruction-set 
extensions, or additional registers and caches beyond 
those defined by the architecture. The common core 
architecture provides a basis for code compatibility for 
all 80960 family products, while application-specific ex- 
tensions optimize a particular product for a class of 
applications. 

The 80960 architectural target is the execution of mul- 
tiple instructions per clock (i.e., fractional clocks per 
instruction). By defining an architecture which sup- 
ports parallel instruction execution and out-of-order in- 
struction execution, performance advances are not con- 
strained by the system clock. 

The 80960CA is capable of launching and executing 
instructions in parallel. This is accomplished by the use 
of advanced silicon technology as well as innovative 
"microarchitectural" constructs. The term microarchi- 



tecture refers to the implementation of the instruction 
set and programming resources. For example, different 
microarchitectures may have different pipeline
construction, internal bus widths, register set porting,
degrees of parallelism, and cache parameterization
(two-way, four-way, etc.).

A principal objective of the 80960 architecture is to 
provide the framework to allow microarchitectural ad- 
vances to translate directly into increased performance 
without architectural limitations. 
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Figure 2-1. 80960CA 



2.1 80960 Architecture 

Embedded applications are cost sensitive, require a dif- 
ferent mix of instructions than reprogrammable appli- 
cations, have demanding interrupt response require- 
ments, and often use real-time executives rather than 
full-blown operating systems. The 80960 architecture 
was developed with these factors in mind. Several key 
optimizations which are provided by the architecture 
are explained below. 

Instruction Set: Powerful Boolean operations are pro- 
vided. Frequently executed functions are available as 
single instructions for greater code density and per- 
formance. Call, Return, Compare-and-Branch, Condi- 
tional-Compare, Compare-and-Increment or Decre- 
ment, and Bit-Field-Extract are each single instruc- 
tions. 

Interrupts: A priority interrupt structure simplifies the 
management of real-time events. With 31 discrete levels
of priority and 248 possible interrupt-handling proce- 
dures, this structure provides the low latency and high 
throughput interrupt handling required in embedded 
processor applications. 
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Faults: A generalized fault-handling mechanism simpli- 
fies the task of detecting errant arithmetic calculations 
or other conditions that typically require a significant 
amount of in-line user code. 

Application-Specific Extensions: The core architecture 
is designed to accept application-specific extensions 
such as instruction set extensions (e.g., string functions, 
floating point), special purpose registers, larger caches, 
on-chip program and data memory, a memory manage- 
ment and protection unit, fault-tolerance support, mul- 
tiprocessing support, and real-time peripherals (DMA, 
serial ports, etc.). 



2.2 80960 C-Series Core 

The C-series core is an implementation of the 80960
core architecture. The core can execute instructions at
a sustained speed of 66 MIPS(1) with bursts of
performance up to 99 MIPS. To achieve this level of
performance, Intel has incorporated state-of-the-art
silicon technology and innovative microarchitectural
constructs into the C-Series core. Factors which contribute
to the core's performance are listed below.

— Parallel instruction decoding allows the 80960CA 
to start two instructions in every clock, with bursts 
of three instructions per clock. 

— Most instructions execute in a single clock cycle. 

— Multiple independent execution units enable over- 
lapping instruction execution. 

— Advanced silicon technology allows operation with 
a 33 MHz internal clock. 

— Efficient instruction pipeline is designed to mini- 
mize pipeline break losses. 

— Register and resource scoreboarding transparently 
manage parallel execution. 

— Branch look-ahead feature enables branches to exe- 
cute in parallel with other instructions. 

— Local register cache is integrated on-chip. 

— 1 Kbyte two-way set associative instruction cache is 
integrated on-chip. 

— 1 Kbyte Static Data RAM is integrated on-chip. 

These factors combine to make the 80960CA an ultra- 
high performance computing engine. 

NOTE: 

1. Single clock instructions at 33 MHz. 
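
The 66 MIPS and 99 MIPS figures follow from the issue rates and the clock frequency quoted in the list above. As a small arithmetic sketch (the function name is illustrative, and the calculation assumes every issued instruction completes in a single clock, as Note 1 states):

```python
def mips(clock_mhz, instructions_per_clock):
    """Millions of instructions per second for single-clock instructions."""
    return clock_mhz * instructions_per_clock


SUSTAINED_MIPS = mips(33, 2)  # two instructions started per clock
BURST_MIPS = mips(33, 3)      # bursts of three instructions per clock
```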



2.3 80960CA System Peripherals 

The 80960CA features several extensions to the core 
architecture in the form of integrated peripherals. 
These peripherals are intended to reduce the external 
system requirements needed for embedded applications. 
These peripherals are described below. 



Bus Controller Unit: A 32-bit high-performance bus 
controller interfaces the 80960CA to external memory 
and peripherals. The bus controller transfers instruc- 
tions or data at a maximum rate of 132 Mbytes per 
second.(2) Internally programmable wait states and 16 
separately configurable memory regions allow the bus 
controller to interface with a variety of memory
subsystems with minimum system complexity and maximum
performance. 

DMA Controller: A four channel DMA controller per- 
forms high speed data transfers between peripherals 
and memory. The DMA controller provides advanced 
features such as data chaining, byte assembly and disas- 
sembly, and a fly-by mode capable of transfer speeds of 
up to 66 Mbytes per second. The DMA controller fea- 
tures a performance and flexibility which is only possi- 
ble by integrating the DMA controller and the 
80960CA core. 

Interrupt Controller: A priority interrupt controller 
manages 8 external interrupt inputs, 4 internal inter- 
rupt sources from the DMA controller, and a single 
non-maskable interrupt input (NMI). A total of 248 
external interrupt sources are supported by the inter- 
rupt controller by configuring the 8 external interrupt 
pins as an 8-bit input port. The interrupt controller pro- 
vides the mechanism for the low latency and high 
throughput interrupt service featured by the 80960CA. 
The interrupt latency for the 80960CA is typically less 
than 1 µs.



3.0 EXECUTION ENVIRONMENT 

The Execution Environment (Figure 3-1) refers to the 
resources which are available for executing code on the 
80960CA. The following sections describe the elements 
of the execution environment. 



3.1 Registers and Literals 

The 80960CA provides four types of working data reg- 
isters: Global Registers, Local Registers, Special Func- 
tion Registers (SFRs), and Control Registers. 

Global and local registers are general purpose 32-bit 
data registers. The SFRs and the control registers pro- 
vide a programmer's interface to the on-chip peripher- 
als (i.e., the DMA controller, interrupt controller, and 
bus controller). 

NOTE: 

2. 33 MHz internal clock, load or instruction fetch on 
wait state, pipelined burst bus. 
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Figure 3-1. Execution Environment 



The 80960 architecture is a register-oriented architec- 
ture. That is, operands and results of instructions are 
placed in working data registers rather than in memory. 
Since the architecture is register oriented, an ample 
supply of registers is provided. The architecture's work- 
ing register set consists of 16, 32-bit global registers and 
16, 32-bit local registers. 

3.1.1 GLOBAL AND LOCAL REGISTERS 

The procedure call and return mechanism, which is 
part of the 80960 architecture, inspires the names given 
to the local and global registers. When a procedure call 
or return is executed, the contents of global registers 
are preserved across procedure boundaries. In other 
words, the same set of global registers is used for each 
procedure. A new set of local registers, however, is allo- 
cated for each procedure. The 80960's call and return 
mechanism is explained in Section 3.8. 

The 80960CA supplies 16, 32-bit global registers
designated g0 through g15. Registers g0 through g14 are
general purpose global registers. Register g15 is
reserved for the current Frame Pointer. This register is
available in assembly language as the fp register. The fp 
contains the address of the first byte in the current 
stack frame. The fp register and the stack frame are 
described in Section 3.8. 



The 80960CA supplies 16, 32-bit Local Registers
designated r0 through r15. Registers r3 through r15 are
general purpose local registers. Registers r0, r1, and r2 are
reserved for special functions as follows: r0 contains the
Previous Frame Pointer, r1 contains the Stack Pointer,
and r2 is reserved for the Return Instruction Pointer.
These registers are available in assembly language as,
respectively, the pfp, sp, and rip registers. The pfp, sp,
and rip registers manage stack frame linkage for the
80960's procedure call and return mechanism. The
function of these registers is described in Section 3.8.
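
The register naming described in this section can be summarized as a small lookup. This is only an illustrative sketch: the dictionary and the helper names are assumptions made for the example and are not part of any Intel tool.

```python
# Assembly-language aliases for the reserved registers (Section 3.1.1).
REGISTER_ALIASES = {
    "pfp": "r0",   # Previous Frame Pointer
    "sp": "r1",    # Stack Pointer
    "rip": "r2",   # Return Instruction Pointer
    "fp": "g15",   # current Frame Pointer
}


def canonical(name):
    """Map an alias such as 'sp' to its register name; pass others through."""
    return REGISTER_ALIASES.get(name, name)


def is_general_purpose(name):
    """True for g0 to g14 and r3 to r15, the general purpose registers."""
    reg = canonical(name)
    kind, number = reg[0], int(reg[1:])
    if kind == "g":
        return 0 <= number <= 14
    if kind == "r":
        return 3 <= number <= 15
    raise ValueError(f"not a global or local register: {name}")
```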

3.1.2 SPECIAL FUNCTION REGISTERS AND 
CONTROL REGISTERS 

The 80960CA uses 3 Special Function Registers (SFRs) 
for communicating with on-chip peripherals. These 
SFRs are an architectural extension specific to the
80960CA. The SFRs on the 80960CA are designated as
sf0, sf1, and sf2. SFRs are accessed as source operands
by most of the 80960CA's instructions. The registers 
serve as part of the programmer's interface to the 
DMA and interrupt controller. 
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Control registers, like SFRs are used to communicate 
with the on-chip peripherals. Configuration informa- 
tion for the peripherals is generally stored in these reg- 
isters. Control registers can only be accessed by using 
the system control (sysctl) instruction. The sysctl 
instruction is used to load the internal control register 
from a table in external memory called the control ta- 
ble. In order to simplify the process of peripheral con- 
figuration, the control registers are automatically load- 
ed from this table at initialization. 

3.1.3 LITERALS 

The 80960CA provides literals which may be used in 
the place of source register operands in most
instructions. The literals range from 0 to 31 (5 bits). When a
literal is used as an operand, the processor expands it to 
32 bits by adding leading zeros. If the instruction de- 
fines an operand larger than 32 bits, the processor zero 
extends the literal to the operand size. 
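
As a brief illustration of the zero extension described above, the sketch below renders a 5-bit literal as a binary string of the operand width. The function name and the string representation are assumptions chosen for readability; the processor performs this expansion in hardware.

```python
def expand_literal(literal, operand_bits=32):
    """Zero-extend a 5-bit literal (0 to 31) to the operand width.

    Returns the expanded value as a binary string so the leading
    zeros are visible.
    """
    if not 0 <= literal <= 31:
        raise ValueError("literals range from 0 to 31")
    return format(literal, f"0{operand_bits}b")
```

For a 64-bit (long) operand, `expand_literal(31, 64)` yields 59 leading zeros followed by `11111`.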



3.2 Address Space and Memory

The address space of the 80960CA (Figure 3-2) is con- 
sidered a subset of the execution environment since the 
code, data, data structures, and external peripherals for 
the processor reside here. The 80960 family has an
address space which is 2^32 bytes (4 Gbytes) in size. This
address space is linear (unsegmented); therefore, code, 
data, and peripherals may be placed anywhere in the 
usable space. For the 80960CA, some memory loca- 
tions are reserved or are assigned special functions as 
shown in Figure 3-2. 



3.2.1 INTERNAL DATA RAM 

The 80960CA provides 1 Kbyte of internal static RAM 
for fast access of frequently used data. The data RAM 
allows time critical data storage and retrieval, with no 
dependence on the performance of the external bus. 
Any load or store, including quad-word 



ADDRESS RANGE                CONTENTS

0000 0000H to 0000 003FH     Interrupt Vectors (optional)
                             (Internal SRAM)

0000 0040H to 0000 00BFH     DMA Registers (optional)
                             (Internal SRAM)

0000 00C0H to 0000 00FFH     Data RAM (Internal SRAM,
                             User Write Protected)

0000 0100H to 0000 03FFH     Data RAM (Internal SRAM,
                             Programmable User Write Protection)

0000 0400H to FEFF FFFFH     Code/Data, Architecturally Defined
                             Data Structures (External Memory)

FF00 0000H to FFFF FEFFH     Reserved

FFFF FF00H to FFFF FF2CH     Initialization Boot Record
                             (External Memory)

FFFF FF2DH to FFFF FFFFH     Reserved

Decimal boundaries: 64, 192, 256, and 1024 bytes; the top of the
address space is 2^32 - 1 (4 Gbytes).
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Figure 3-2. Address Space 
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operations, execute in a single clock cycle when direct- 
ed to internal data RAM. The data RAM is located at 
address 00H in the processor's address space. When the 
DMA controller is in use, 32 bytes of data RAM are 
reserved for each active DMA channel. Additionally, 
64 bytes of data RAM are reserved for 16 interrupt 
vectors which may be cached internally to reduce inter- 
rupt latency. The data RAM reserved for the DMA 
controller and the interrupt controller can be used for 
additional data storage when these peripherals are not 
used. 

Two execution modes are possible on the 80960CA, 
user mode or supervisor mode. These modes are used to 
implement a protection model in which system data 
structures are isolated from user code. As shown in 
Figure 3-2, the first 256 bytes of data RAM are always 
write protected when a program is executing in user 
mode but may always be written when executing in 
supervisor mode. The remainder of the data RAM can 
be programmed for this protection feature. The user
and supervisor modes are described further in Section 
3.7. 

3.2.2 RESERVED ADDRESS SPACE 

The upper 16 Mbytes of memory (FF000000H- 
FFFFFFFFH) are reserved for specific functions and 
extensions to the 80960 architecture. The 12 words in
reserved space (FFFFFF00H-FFFFFF2CH) are used 
to start up the processor when it comes out of reset. 
These 12 words are called the initialization boot record. 
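
The address map of Figure 3-2 can be expressed as a simple range lookup, as in the sketch below. The boundaries are copied from the figure; the table name, function names, and region labels are illustrative assumptions.

```python
# (first address, last address, description), per Figure 3-2.
ADDRESS_MAP = [
    (0x00000000, 0x0000003F, "interrupt vectors (optional, internal SRAM)"),
    (0x00000040, 0x000000BF, "DMA registers (optional, internal SRAM)"),
    (0x000000C0, 0x000000FF, "data RAM, user write protected"),
    (0x00000100, 0x000003FF, "data RAM, programmable user write protection"),
    (0x00000400, 0xFEFFFFFF, "external code/data"),
    (0xFF000000, 0xFFFFFEFF, "reserved"),
    (0xFFFFFF00, 0xFFFFFF2C, "initialization boot record"),
    (0xFFFFFF2D, 0xFFFFFFFF, "reserved"),
]


def region(address):
    """Return the Figure 3-2 region that contains a 32-bit address."""
    for first, last, name in ADDRESS_MAP:
        if first <= address <= last:
            return name
    raise ValueError("address is outside the 32-bit address space")


def always_user_write_protected(address):
    """True for the first 256 bytes of data RAM (Section 3.2.1)."""
    return 0 <= address < 0x100
```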



3.3 Memory Addressing Modes 

The 80960CA offers a variety of modes for memory 
addressing. The addressing modes available are summa- 
rized in Table 3-1. 

Absolute addressing is used to reference an address as
an offset from address 0 of the processor's address
space. At the machine level, absolute addressing may be
implemented in one of two ways depending on the size
of the absolute offset from address 0. Two instruction
formats, MEMA and MEMB, are used to provide
absolute addressing modes. For the MEMA format, the
offset is an ordinal number ranging from 0 to 2048. For
the MEMB format, the offset is an integer (called a
displacement) ranging from -2^31 to 2^31 - 1. An
assembler will choose the MEMA or MEMB format based on
the size of the offset.

Register-indirect addressing modes use a 32-bit ordinal 
value in a register as the base for the address calcula- 
tion. Offsets and indexes are added to this address base 
depending on the particular addressing mode. The 
register-indirect-with-index addressing mode adds a 
scaled index to the address base. The index is specified 
as a value in a register. The scale value may be selected 
as 1, 2, 4, 8, or 16.

The index-with-displacement addressing mode uses a 
scaled index plus an integer displacement. No address 
base is used in this address calculation. 



3.2.3 ARCHITECTURALLY DEFINED DATA 
STRUCTURES 

To execute a program on the 80960CA, data structures 
specific to the 80960 architecture must reside in the 
processor's address space. Architecture-defined data 
structures include stacks, initialization structures, and 
various procedure entry tables. These data structures 
may generally be located anywhere in the address 
space. Pointers to each data structure are specified 
when the 80960CA is initialized. The architecture-de- 
fined data structures include: 

— User Stack

— Interrupt Stack

— Supervisor Stack

— Interrupt Table

— System-Procedure Table

— Fault Table


In addition to the data structures defined by the
architecture, the 80960CA requires several
implementation-specific data structures which are used for configuring
peripherals and initialization. These data structures in- 
clude: 

— Control Table 

— Process Control Block 

— Initialization Boot Record 

Each data structure will be explained in more detail 
later in this product overview. 



The IP-with-displacement addressing mode is used with
load and store instructions to make them IP relative. In 
this mode, an integer displacement plus a constant of 8 
is added to the IP of the instruction to calculate the 
next address. 

Table 3-1. Memory Addressing Modes 



Mode                             Description

Absolute Offset                  Offset

Absolute Displacement            Displacement

Register Indirect                Abase

Register Indirect with           Abase + Offset
Offset

Register Indirect with           Abase + (Index*Scale)
Index

Register Indirect with           Abase + (Index*Scale)
Index and Displacement           + Displacement

Index with Displacement          (Index*Scale) +
                                 Displacement

Register Indirect with           Abase + Displacement
Displacement

IP with Displacement             IP + Displacement + 8
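
The address calculations in Table 3-1 can be sketched as a single function. This is an illustrative model only: the mode names, the keyword arguments, and the wrap-around to 32 bits are assumptions made for the example, and the sketch does not model instruction encoding (MEMA versus MEMB).

```python
MASK32 = 0xFFFFFFFF
VALID_SCALES = (1, 2, 4, 8, 16)


def effective_address(mode, abase=0, index=0, scale=1,
                      offset=0, displacement=0, ip=0):
    """Compute an effective address using the formulas of Table 3-1."""
    if scale not in VALID_SCALES:
        raise ValueError("scale must be 1, 2, 4, 8, or 16")
    scaled = index * scale
    formulas = {
        "absolute_offset": offset,
        "absolute_displacement": displacement,
        "register_indirect": abase,
        "register_indirect_with_offset": abase + offset,
        "register_indirect_with_index": abase + scaled,
        "register_indirect_with_index_and_displacement":
            abase + scaled + displacement,
        "index_with_displacement": scaled + displacement,
        "register_indirect_with_displacement": abase + displacement,
        "ip_with_displacement": ip + displacement + 8,
    }
    if mode not in formulas:
        raise ValueError(f"unknown addressing mode: {mode}")
    # Assumption: results wrap to the 32-bit linear address space.
    return formulas[mode] & MASK32
```

For example, a register-indirect-with-index access with a base of 1000H, an index of 3, and a scale of 4 resolves to 100CH.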
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3.4 Data Types 

The 80960CA operates on the following data types (Figure 3-3): 

— Integer (8, 16, 32, and 64 bits) 

— Ordinal (8, 16, 32, and 64 bits) 

— Bit 

— Bit Field 

— Triple Word (96 bits) 

— Quad Word (128 bits) 



[Diagram: bit and bit field (length 1 to 32 bits), byte (8 bits),
short (16 bits), word (32 bits), long (64 bits), triple word, and
quad word layouts]
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Class          Data Type        Length       Range

Numeric        Byte Integer     8 bits       -2^7 to 2^7 - 1
(Integer)      Short Integer    16 bits      -2^15 to 2^15 - 1
               Integer          32 bits      -2^31 to 2^31 - 1
               Long Integer     64 bits      -2^63 to 2^63 - 1

Numeric        Byte Ordinal     8 bits       0 to 2^8 - 1
(Ordinal)      Short Ordinal    16 bits      0 to 2^16 - 1
               Ordinal          32 bits      0 to 2^32 - 1
               Long Ordinal     64 bits      0 to 2^64 - 1

Non-Numeric    Bit              1 bit        N/A
               Bit Field        1-32 bits    N/A
               Triple Word      96 bits      N/A
               Quad Word        128 bits     N/A




Figure 3-3. Data Types 
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The following sections describe the data types support- 
ed by the 80960CA. 

3.4.1 NUMERIC DATA TYPES 

Integers and ordinals are considered numeric data 
types since the processor performs arithmetic opera- 
tions with this data. The integer data type is a signed 
binary value in standard 2's complement representa- 
tion. The ordinal data type is an unsigned binary value. 
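
The ranges listed in Figure 3-3 follow directly from these two representations. A minimal sketch (function names are illustrative):

```python
def integer_range(bits):
    """Range of a 2's complement integer: -2^(bits-1) to 2^(bits-1) - 1."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1


def ordinal_range(bits):
    """Range of an unsigned ordinal: 0 to 2^bits - 1."""
    return 0, (1 << bits) - 1
```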

3.4.2 NON-NUMERIC DATA TYPES 

The remaining data types (bit field, triple word, and 
quad word) represent groupings of bits or bytes that the 
processor can operate on as a whole, regardless of the 
nature of the data contained in the group. These data 
types facilitate the moving of blocks of bits or bytes. 

3.5 Instruction Set 

The 80960CA features a comprehensive instruction set 
(Table 3-2). Much of the instruction set is that of a 
RISC architecture. Unlike pure RISC machines, how- 
ever, the 80960CA provides an extension to the RISC 
instruction set with instructions that perform complex 
functions such as procedure calls and returns, high- 
speed multiplies, and other complex control, arithme- 
tic, and logical operations. The instruction set allows 
functionally complex yet highly compact code to be 
written for embedded control applications where mem- 
ory is a valuable commodity. 

3.5.1 INSTRUCTION GROUPS 

The 80960CA instruction set is most easily described if 
grouped by the functions listed below: 

— Data Movement 

— Address Computation

— Logical and Arithmetic 

— Bit and Bit Field 

— Comparison 

— Branch 

— Call and Return 

— Fault 

— Debug 

— Processor Management 

The instructions which make up each of these groups 
are described in the following sections. 



3.5.1.1 Data Movement Instructions 

The data movement instructions move data from mem- 
ory to registers, from registers to memory, and between 
registers. The load instructions copy bytes, words, or 
multiple words from memory to a selected register or 
group of registers. Conversely, the store instructions 
copy bytes, words, or groups of words from a selected 
register or group of registers to memory. The move in- 
structions copy data between registers. 

Load Instructions

ld       load word
ldob     load ordinal byte
ldos     load ordinal short
ldib     load integer byte
ldis     load integer short
ldl      load long
ldt      load triple
ldq      load quad

Store Instructions

st       store word
stob     store ordinal byte
stos     store ordinal short
stib     store integer byte
stis     store integer short
stl      store long
stt      store triple
stq      store quad

Move Instructions

mov      move word
movl     move long
movt     move triple
movq     move quad



3.5.1.2 Address Computation Instructions 

The load address (lda) instruction causes a 32-bit
address to be computed and placed in a destination
register. The address is computed based on the addressing
mode selected. The load and store instructions perform
a function identical to that of the lda instruction when
calculating a source or destination address. The lda
instruction is useful for loading a 32-bit constant into a
register.



3.5.1.3 Logical and Arithmetic Instructions 

Logical instructions perform bitwise Boolean opera- 
tions on operands in registers. Since this group of in- 
structions performs only bitwise manipulations of data, 
separate logical instructions for integer and ordinal 
data types do not exist. In the table below, src1 and
src2 represent processor registers or literals which are 
the operands for these instructions. 
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Table 3-2. Instruction Set Summary 



Data Movement:         Load, Store, Move

Arithmetic:            Add, Subtract, Multiply, Divide, Remainder,
                       Modulo, Shift, Extended Shift, Extended
                       Multiply, Extended Divide, Add with Carry,
                       Subtract with Carry

Logical:               And, Not And, And Not, Or, Exclusive Or,
                       Not Or, Or Not, Nor, Exclusive Nor, Not,
                       Nand, Rotate

Bit and Bit Field:     Set Bit, Clear Bit, Not Bit, Check Bit,
                       Alter Bit, Scan for Bit, Scan for Byte,
                       Span over Bit, Extract, Modify

Comparison:            Compare, Conditional Compare, Compare and
                       Increment, Compare and Decrement,
                       Condition Test

Branch:                Unconditional Branch, Conditional Branch,
                       Branch and Link, Compare and Conditional
                       Branch

Call and Return:       Call, Call Extended, Call System, Return

Fault:                 Conditional Fault, Synchronize Faults

Debug:                 Modify Trace Controls, Mark, Force Mark

Processor Management:  Modify Process Controls, Modify Arithmetic
                       Controls, System Control, Update DMA,
                       Setup DMA, Flush Local Registers

Address Computation:   Load Address

Atomic:                Atomic Add, Atomic Modify
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Logical Instructions 



- and       src1 and src2
- notand    src1 and (not src2)
- andnot    (not src1) and src2
- or        src1 or src2
- notor     src1 or (not src2)
- ornot     (not src1) or src2
- xor       src1 xor src2
- xnor      src1 xnor src2
- nor       not (src1 or src2)
- nand      not (src1 and src2)
- not       not (src1)
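
The table above can be modeled on 32-bit values as in the following sketch. The operand roles are taken exactly as the table lists them; the function and dictionary names are illustrative assumptions, and the model ignores destination registers and instruction timing.

```python
MASK32 = 0xFFFFFFFF


def _not(value):
    """Bitwise complement confined to 32 bits."""
    return ~value & MASK32


# Each entry follows the corresponding row of the Logical Instructions table.
LOGICAL_OPS = {
    "and":    lambda s1, s2: s1 & s2,
    "notand": lambda s1, s2: s1 & _not(s2),
    "andnot": lambda s1, s2: _not(s1) & s2,
    "or":     lambda s1, s2: s1 | s2,
    "notor":  lambda s1, s2: s1 | _not(s2),
    "ornot":  lambda s1, s2: _not(s1) | s2,
    "xor":    lambda s1, s2: s1 ^ s2,
    "xnor":   lambda s1, s2: _not(s1 ^ s2),
    "nor":    lambda s1, s2: _not(s1 | s2),
    "nand":   lambda s1, s2: _not(s1 & s2),
    "not":    lambda s1, s2: _not(s1),   # src2 is unused
}


def logical(op, src1, src2=0):
    """Apply a logical operation to 32-bit operands."""
    return LOGICAL_OPS[op](src1 & MASK32, src2 & MASK32) & MASK32
```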



Arithmetic instructions perform add, subtract, multi- 
ply, divide, and shift operations on integer or ordinal 
operands in registers. 

Arithmetic Instructions 



addi      add integer
addo      add ordinal
subi      subtract integer
subo      subtract ordinal
muli      multiply integer
mulo      multiply ordinal
divi      divide integer
divo      divide ordinal
remi      remainder integer
remo      remainder ordinal
modi      modulo integer
rotate    rotate bit left
shli      shift left integer
shlo      shift left ordinal
shri      shift right integer
shro      shift right ordinal
shrdi     shift right dividing integer



Extended arithmetic instructions facilitate computation 
on ordinals and integers which are longer than 32 bits. 
In add with carry and subtract with carry instructions, 
the carry out from the previous arithmetic instruction 
is used in the computation. The extended multiply in- 
struction multiplies two ordinal source operands pro- 
ducing a long ordinal result (64 bits). The extended 
divide instruction divides a long ordinal dividend by an 
ordinal divisor and produces a 64-bit result. The ex- 
tended shift right instruction shifts a 64-bit source val- 
ue and produces the lower order 32 bits of the shifted 
value. 

Extended Arithmetic Instructions 

- addc add ordinal with carry 

- subc subtract ordinal with carry 

- emul extended multiply 

- ediv extended divide 

- eshro shift right extended ordinal 
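
The following sketch illustrates how these operations support arithmetic beyond 32 bits. It is a behavioral model only: the Python function names mirror the mnemonics for readability, the carry is passed explicitly rather than through the arithmetic control register, and register pairing and fault behavior are not modeled. `add64` is a hypothetical helper, not an instruction.

```python
MASK32 = 0xFFFFFFFF


def emul(src1, src2):
    """Multiply two 32-bit ordinals, giving a long ordinal (64 bits)."""
    return (src1 & MASK32) * (src2 & MASK32)


def addc(src1, src2, carry_in):
    """Add two 32-bit ordinals plus a carry; return (sum, carry_out)."""
    total = (src1 & MASK32) + (src2 & MASK32) + carry_in
    return total & MASK32, total >> 32


def add64(a, b):
    """Add two 64-bit ordinals as two chained 32-bit adds with carry."""
    low, carry = addc(a & MASK32, b & MASK32, 0)
    high, carry = addc(a >> 32, b >> 32, carry)
    return (high << 32) | low, carry


def eshro(src64, shift):
    """Shift a 64-bit value right and keep the low-order 32 bits."""
    return (src64 >> shift) & MASK32
```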



The atomic instructions perform read-modify-write op- 
erations on operands in memory. They allow a system 
to ensure that when an atomic operation is performed
on a specified memory location, the operation will be 
completed before another agent is allowed to perform 
an operation on the same memory. These instructions 
are required to enable synchronization between inter- 
rupt handlers and background tasks in any system. 
They are also particularly useful in systems where sev- 
eral agents (processors, coprocessors, or external logic) 
have access to the same system memory for communi- 
cation. 

Atomic Instructions 
- atadd atomic add
- atmod atomic modify

3.5.1.4 Bit and Bit Field Instructions 

The bit instructions operate on a specified bit in a regis- 
ter. 



Bit Instructions 


- setbit     set bit
- clrbit     clear bit
- notbit     not bit
- alterbit   alter bit
- scanbit    scan for bit
- spanbit    span over bit



Bit field instructions operate on a specified contiguous 
group of bits in a register. This group of bits can be 
from 0 to 32 bits in length.

Bit Field Instructions 

- extract extract field
- modify modify field
- scanbyte scan for byte
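
As a rough illustration of field extraction and modification, the sketch below operates on 32-bit values. The operand order and exact semantics are assumptions made for the example (a field selected by bit position and length for extract, and by a mask for modify); consult the 80960CA User's Manual for the actual instruction definitions.

```python
MASK32 = 0xFFFFFFFF


def extract(value, bitpos, length):
    """Return the field of `length` bits starting at bit `bitpos`,
    right-justified (assumed behavior)."""
    field_mask = (1 << length) - 1
    return (value >> bitpos) & field_mask


def modify(mask, src, dest):
    """Replace the bits of dest selected by mask with the matching
    bits of src (assumed behavior)."""
    return ((src & mask) | (dest & ~mask)) & MASK32
```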

3.5.1.5 Branch Instructions 

The branch instructions allow the direction of program 
flow to be changed by explicitly modifying the 
Instruction Pointer (IP). The target IP in a branch in- 
struction is generally specified as a displacement to be 
added to the current IP. The extended branch instruc- 
tions allow IP calculation using any addressing mode. 

The unconditional branch instructions always alter pro- 
gram flow when executed. 

Unconditional Branch 
Instructions 



- b branch
- bx branch extended



The RISC branch-and-link instructions automatically 
save a Return Instruction Pointer (RIP) before the 
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jump is taken. The RIP is the address of the instruction 
following the branch and link. 

Branch and Link Instructions 

- bal branch and link 

- balx branch and link extended 

Conditional branch instructions alter program flow 
only if the condition code flags in the arithmetic control 
register match a value specified in the instruction. The 
condition code flags indicate conditions of equality or 
inequality between two operands in a previously execut- 
ed instruction. The arithmetic control register and con- 
dition code flags are described in Section 3.6. 

Based on a branch prediction flag located in the ma- 
chine level instruction, the 80960CA will assume that 
an instruction usually takes or does not take a condi- 
tional branch. By executing along the predicted path of 
program flow, delays due to breaks in the instruction 
stream are often avoided. This feature of the 80960CA 
is referred to as branch prediction. The 80960CA incor- 
porates the branch prediction feature because code us- 
ing a conditional branch instruction usually favors a 
single direction of program flow. 

The branch prediction flag is specified at the assembly 
level by appending a .t or .f to a conditional branch
instruction meaning, respectively, "assume branch tak- 
en" or "assume branch not taken". For example, the 
assembler mnemonic be.t means that the processor will 
assume that this branch-if-equal instruction usually 
branches when encountered. In the following table .p 
represents the branch prediction flag. 

Conditional Branch Instructions 

- be.p branch if equal 

- bne.p branch if not equal 

- bl.p branch if less 

- ble.p branch if less or equal 

- bg.p branch if greater 

- bge.p branch if greater or equal 

- bo.p branch if ordered 

- bno.p branch if unordered 

Compare and conditional branch instructions compare 
two operands, then branch according to the immediate 
results. 

Compare and Conditional Branch Instructions

- cmpibe.p     compare integer and branch if equal
- cmpibne.p    compare integer and branch if not equal
- cmpibl.p     compare integer and branch if less
- cmpible.p    compare integer and branch if less or equal
- cmpibg.p     compare integer and branch if greater
- cmpibge.p    compare integer and branch if greater or equal
- cmpibo.p     compare integer and branch if ordered
- cmpibno.p    compare integer and branch if unordered
- cmpobe.p     compare ordinal and branch if equal
- cmpobne.p    compare ordinal and branch if not equal
- cmpobl.p     compare ordinal and branch if less
- cmpoble.p    compare ordinal and branch if less or equal
- cmpobg.p     compare ordinal and branch if greater
- cmpobge.p    compare ordinal and branch if greater or equal
- bbs.p        check bit and branch if set
- bbc.p        check bit and branch if clear



3.5.1.6 Compare and Condition Test 
Instructions 

The 80960CA provides several types of instructions 
that are used to compare two operands. The condition 
code flags in the arithmetic control register are set to 
indicate whether one operand is less than, equal to, or 
greater than the other operand. 

Compare Instructions 
-cmpi compare integer 
-cmpo compare ordinal 
-chkbit check bit 

Conditional compare instructions test the existing 
status of the condition code flags before a compare is 
performed. These conditional compare instructions are
provided to optimize two-sided range comparisons (i.e. 
to test if a value is less than one number but greater 
than another). 

Conditional Compare Instructions 

- concmpi conditional compare integer 

- concmpo conditional compare ordinal 

The compare and increment and compare and decrement
instructions set the condition code flags based on
a comparison of two register sources, increment or
decrement one of the sources, and finally store the
result in a destination register.

- cmpinci compare and increment integer 

- cmpinco compare and increment ordinal 

- cmpdeci compare and decrement integer 

- cmpdeco compare and decrement ordinal 

The condition test instructions allow the state of the 
condition code flags to be tested. Based on the outcome 
of the comparison, a true or false code is stored in a 
destination register. The branch prediction flag is used 
in this instruction to reduce the execution time of the 
instruction when the test outcome is predicted correct- 
ly. For example teste.t (test if equal) will execute in a 
shorter time if the condition code flags test true for the 
equal condition. Analogous to the function of the 
branch prediction flag in the conditional compare and 
branch instructions, the prediction flag in this case 
eliminates breaks in the micro-instruction sequence 
which is used to implement the condition test instruc- 
tions. 



Condition Test Instructions

- teste.p test if equal
- testne.p test if not equal
- testl.p test if less
- testle.p test if less or equal
- testg.p test if greater
- testge.p test if greater or equal
- testo.p test if ordered
- testno.p test if not ordered



3.5.1.7 Call and Return Instructions 

The 80960CA features an on-chip call and return 
mechanism for making procedure calls to local and system
procedures. The call instructions and the call and
return mechanism are described in Section 3.8.

Call and Return Instructions 

- call call 

- callx call extended 

- calls call system 
-ret return 



3.5.1.8 Fault Instructions 

The 80960CA will fault automatically as the result of 
certain errant operations which may occur when exe- 
cuting code. Fault procedures are then invoked auto- 
matically to handle the various types of faults. In addi- 
tion, the fault instructions permit a fault to be generat- 
ed explicitly based on the value of the condition code 
flags. The branch prediction flag in these instructions is
used to reduce their execution time when the state of
the condition code flags is predicted correctly.

Conditional Fault Instructions

- faulte.p fault if equal
- faultne.p fault if not equal
- faultl.p fault if less
- faultle.p fault if less or equal
- faultg.p fault if greater
- faultge.p fault if greater or equal
- faulto.p fault if ordered
- faultno.p fault if unordered



The syncf instruction causes the processor to wait for 
all faults to be generated which are associated with any 
prior uncompleted instructions. 

-syncf synchronize faults 



3.5.1.9 Debug Instructions 

The processor supports debugging and monitoring of 
program activity through the use of trace events. The 
debug instructions support debugging and monitoring 
software. 

Debug Instructions 
-modtc modify trace controls 

- mark mark 

- fmark force mark



3.5.1.10 Processor Management Instructions 

The 80960CA provides several instructions for direct 
control of processor functions and for configuring the 
80960CA's peripherals. A brief description of the proc- 
essor management instructions is given below. 

Processor Management Instructions 

-modpc modify process controls 

-modac modify arithmetic controls 

- sysctl system control

- udma update DMA SRAM 

- sdma setup DMA 

- flushreg flush local registers 
3.6 Arithmetic Controls

The Arithmetic Control (AC) Register is a 32-bit on-chip 
register (Figure 3-4). The AC register is used primarily 
to monitor and control the execution of 80960CA arith- 
metic instructions. The processor reads and modifies 
bits in the AC register when performing many arithme- 
tic operations. The AC register is also used to control 
the faulting conditions for some instructions. The 
modac instruction allows the user to directly read or 
modify the AC register. 

The processor sets the condition code flags (bits 0-2) to 
indicate equality or inequality as the result of certain 
instructions (such as the compare instructions). Other 
instructions, such as the conditional branch instruc- 
tions, take action based on the value of the condition 
code flags. Table 3-3 shows the functional assignment 
for each condition code flag. 

Table 3-3. Arithmetic Condition Codes

Condition Code    Condition
001               Greater Than
010               Equal
100               Less Than
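
As an illustration of how the condition code flags drive the conditional instructions, the following Python sketch models the three codes of Table 3-3 and a conditional branch test. The per-mnemonic mask values and the function names are assumptions of this sketch, not taken from this section: it assumes the usual 80960 convention in which a branch is taken when its 3-bit mask shares a set bit with the condition code, and bno is taken when the code is 000.

```python
# Condition code values from Table 3-3 (bits 0-2 of the AC register).
CC_GREATER = 0b001
CC_EQUAL = 0b010
CC_LESS = 0b100

# Assumed 3-bit test masks for the conditional branch mnemonics.
BRANCH_MASK = {
    "bno": 0b000, "bg": 0b001, "be": 0b010, "bge": 0b011,
    "bl": 0b100, "bne": 0b101, "ble": 0b110, "bo": 0b111,
}

def compare(src1, src2):
    """Return the condition code a compare of src1 with src2 would set."""
    if src1 < src2:
        return CC_LESS
    if src1 == src2:
        return CC_EQUAL
    return CC_GREATER

def branch_taken(mnemonic, cc):
    """Decide whether a conditional branch is taken for condition code cc."""
    mask = BRANCH_MASK[mnemonic]
    if mask == 0:
        # bno (unordered): taken only when no condition code flag is set.
        return cc == 0
    return (mask & cc) != 0
```

For instance, after comparing 3 with 5 the code is 100 (less than), so ble is taken and bg is not.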



The integer overflow flag (bit 8) and the integer over- 
flow mask (bit 12) are used in conjunction with the 
arithmetic integer overflow fault. The mask bit masks 
the integer overflow fault. When the fault is masked, 
and an integer overflow occurs, the integer overflow 
flag is set but no fault handling action is taken. If the 
fault is not masked, and an integer overflow occurs, the 
integer overflow fault is taken and the integer overflow 
flag is not set. 
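
The interaction of the overflow flag and mask can be summarized in a few lines of Python. This is a behavioral sketch only; the exception class and function name are hypothetical, and the bit positions are those given in the text (flag in bit 8, mask in bit 12).

```python
INT_OVERFLOW_FLAG = 1 << 8    # integer overflow flag, AC bit 8
INT_OVERFLOW_MASK = 1 << 12   # integer overflow mask, AC bit 12

class IntegerOverflowFault(Exception):
    """Stands in for the arithmetic integer overflow fault (hypothetical name)."""

def on_integer_overflow(ac):
    """Return the updated AC value after an overflow, or raise the fault."""
    if ac & INT_OVERFLOW_MASK:
        # Fault masked: record the overflow in the flag, take no fault.
        return ac | INT_OVERFLOW_FLAG
    # Fault not masked: the fault is taken and the flag is left unchanged.
    raise IntegerOverflowFault()
```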

The no imprecise faults flag (bit 15) determines if im- 
precise faults are allowed to occur. Fault handling and 
precise and imprecise faults in the 80960CA are dis- 
cussed in Section 3.10. 



3.7 Process Management 

Process management refers to the monitoring and con- 
trol of certain properties of an executing process. The 
following sections describe the mechanisms available on 
the 80960CA to perform this function. 
Figure 3-4. Arithmetic Control Register

3.7.1 PROCESS CONTROL REGISTER

The Process Control (PC) Register (Figure 3-5) provides 
access to process state information. The function for 
the PC register is described below. 

Execution Mode Flag — This flag indicates that the 
processor is executing in user mode (0) or supervisor 
mode (1). 

Priority Field — This 5-bit field indicates the current
executing priority of the processor. Priority values range
from 0 to 31, with 0 as the lowest and 31 as the highest
priority.

State Flag — This flag determines the executing state of 
the processor. The processor state is either executing 
state (0) or interrupted state (1). 

Trace Enable Bit and Trace Fault Pending Flags — 
These fields control and monitor trace activity in the 
processor. The Trace Enable Bit enables fault genera- 
tion for trace events. The Trace Fault Pending Flag 
indicates that a trace event has been detected. 

The process controls can be modified by software with 
the modify process controls (modpc) instruction. The 
modpc instruction may only write the PC register when 
the processor is in supervisor mode. 



3.7.2 PRIORITIES 

The 80960 architecture defines a means to assign priorities
to executing programs and interrupts. The current
priority of the processor is stored in the priority field of 
the PC register. This priority is used to determine if an 
interrupt will be serviced and in which order multiple 
pending interrupts will be serviced. Setting the priority 
of an executing program above that of interrupts allows 
critical code to be prioritized and executed without in- 
terruption. 

The priority field of the PC register can be modified
directly using the modpc instruction. The priority field
is also modified to reflect the priority of serviced
interrupts. On a return from an interrupt routine, the
priority of the processor is restored to its priority
before the interrupt occurred.

3.7.3 PROCESSOR STATES AND MODES 

The 80960CA may execute programs in user mode or 
supervisor mode. The user-supervisor protection mecha- 
nism allows a system to be designed in which kernel 
code and data reside in the same address space as user 
code and data, but access to the kernel procedures and 
data is only allowed through a tightly controlled
interface. This interface is the system call table and the
interrupt mechanism. The 80960CA provides a supervisor
pin (SUP) to implement memory systems which
protect code and data from possible corruption by pro- 
grams executing in user mode. Some instructions and 
functions of the 80960CA are also insulated from code 
executing in user mode. 

The processor has two operating states: executing and 
interrupted. In executing state, the processor can exe- 
cute in user or supervisor mode. In the interrupted 
state, the processor always executes in supervisor mode. 



3.8 Call and Return Mechanism 

The 80960 architecture features a built-in call and re- 
turn mechanism. This mechanism is designed to make 
procedure calls simple and fast, and to provide a flex- 
ible method for storing and handling variables that are 
local to a procedure. A call automatically allocates a 
new set of local registers and a new stack frame. All 
linkage information is maintained by the processor, 
making procedure calls and returns virtually transpar- 
ent to the user. A system call instruction is provided as 
a method for calling privileged procedures such as a 
kernel service. The call and return model supports
efficient translation of structured high-level code (such
as C or Ada) to 80960 machine language.

The procedure call and return mechanism provides a 
number of significant benefits which contribute to the 
performance and ease of use of the 80960CA. 

1) The call and return instructions are implemented en- 
tirely on-chip, resulting in an extremely high per- 
formance implementation of these commonly used 
functions. 

2) A single instruction to implement each call or return
operation results in code density improvements com- 
pared to processors which require multiple instruc- 
tions to encode these functions. 

3) By implementing the call and return functions as 
single instructions, the 80960 architecture is open for 
further optimization of these instructions, while 
maintaining assembly-level compatibility. 

4) A program does not have to explicitly save or restore 
the variables stored in the local registers when a call 
or return is executed. The processor does this implic- 
itly on procedure calls and on returns. 

5) The call and return mechanism provides a structure 
for storing a virtually unlimited number of local 
variables for each procedure: the on-chip local regis- 
ters provide quick access to often used variables and 
the stack provides space for additional variables. 

3.8.1 LOCAL REGISTERS AND THE STACK 
FRAME 

At any point in a program, the 80960 has access to a 
local register set and a section of the procedure stack 
referred to as a stack frame. When a call is executed, a 
new stack frame is allocated for the called procedure. 
Additionally, the current local register set is saved by 
the processor, freeing these registers for use by the
newly called procedure. In this way, every procedure has a
unique stack frame and a unique set of local registers.
When a return is executed, the current local register set and
current stack frame are deallocated. The previous local 
register set and previous stack frame are restored. This 
call and return mechanism is illustrated in Figure 3-6 
where n is procedure depth for the currently executing 
procedure. 

The procedure stack structure is defined by the 80960 
architecture. The procedure stack always grows up- 
ward (i.e. towards higher addresses) and the stack 
pointer (SP) always points to the next available byte of 
the stack frame. The 80960CA requires that each stack 
frame begins on a 16-byte boundary. Due to this alignment
requirement, a padding space of 0 to 15 bytes may
exist between adjacent stack frames in memory. When 
a stack frame is allocated, the first 16 words are always 
assigned as storage for the local registers; therefore, the 
SP initially points to the 17th word in the stack frame. 
It should be noted that although each stack frame is 
assigned storage space for the local registers, these loca- 
tions in the stack are not guaranteed to contain the 
values of the saved local registers. This is because sever- 
al sets of local registers are cached on-chip rather than 
written to the stack in external memory. This caching 
mechanism is described in detail later in this section. 
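
The frame layout rules above can be checked with a small calculation. The Python sketch below is illustrative only: the function name is hypothetical, and it assumes a 4-byte word, so the 16 words reserved for the local registers occupy 64 bytes.

```python
FRAME_ALIGN = 16           # stack frames begin on a 16-byte boundary
LOCAL_REG_BYTES = 16 * 4   # 16 words reserved for local registers (4-byte words assumed)

def allocate_frame(caller_sp):
    """Return (fp, sp, padding) for a frame allocated above caller_sp."""
    # The stack grows upward, so round the caller's SP up to the boundary.
    fp = (caller_sp + FRAME_ALIGN - 1) & ~(FRAME_ALIGN - 1)
    padding = fp - caller_sp   # 0 to 15 bytes between adjacent frames
    sp = fp + LOCAL_REG_BYTES  # SP initially points to the 17th word
    return fp, sp, padding
```

With the caller's SP at 0x1004, the new frame starts at 0x1010 after 12 bytes of padding and its SP starts at 0x1050.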



3.8.2 PROCEDURE LINKING 

The 80960 architecture automatically manages proce- 
dure linkage. One global register and three local regis- 
ters are reserved for procedure linkage information. 
Figure 3-7 describes the pointer structure used to link
frames and to provide a unique SP for each frame.
Register g15 is the Frame Pointer (FP). The FP is the
address of the first byte of the current (topmost) stack
frame. The FP is always updated to point to the current
frame when calls and returns are executed. Register r0
is the Previous Frame Pointer (PFP). The PFP is the
address of the first byte of the stack frame which was
created prior to the frame containing this PFP. Register
r1 is the Stack Pointer (SP). The SP points to the next
available byte of the stack frame. Register r2 is reserved
for the Return Instruction Pointer (RIP). The RIP is
the address of the instruction which follows a call
instruction; this is also the target address for the return
from that procedure. The RIP is automatically stored
in register r2 of the calling procedure when a call is
executed.
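
The linkage described above can be traced with a small software model. The Python sketch below is not the processor's implementation; the class and field names are hypothetical, and it assumes a 4-byte call instruction and 4-byte words. It keeps one record per frame holding the FP together with the PFP (r0), SP (r1) and RIP (r2) values.

```python
class Machine:
    """Toy model of FP/PFP/SP/RIP linkage across calls and returns."""

    def __init__(self, stack_base, ip=0):
        self.ip = ip
        # First frame: 16 words (64 bytes) are reserved for local registers.
        self.frames = [{"fp": stack_base, "pfp": 0,
                        "sp": stack_base + 64, "rip": 0}]

    @property
    def current(self):
        return self.frames[-1]

    def call(self, target, call_length=4):
        caller = self.current
        # The RIP is stored in r2 of the calling procedure.
        caller["rip"] = self.ip + call_length
        # New frame starts at the next 16-byte boundary above the caller's SP.
        new_fp = (caller["sp"] + 15) & ~15
        self.frames.append({"fp": new_fp, "pfp": caller["fp"],
                            "sp": new_fp + 64, "rip": 0})
        self.ip = target

    def ret(self):
        # Deallocate the current frame and resume at the caller's RIP.
        self.frames.pop()
        self.ip = self.current["rip"]
```

A call from IP 0x100 to 0x400 leaves 0x104 in the caller's RIP, and the matching return resumes there with the caller's frame restored.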

3.8.3 PARAMETER PASSING 

Parameters may be passed by value or passed by refer- 
ence between procedures. The global registers, the 
stack, or predefined data structures in memory may be 
used to pass these parameters. 



The global registers provide the fastest method for pass- 
ing parameters. The values to be passed into a proce- 
dure reside in the global registers of the calling proce- 
dure. When a procedure is called, the values in the 
global registers are preserved. If more parameters are to 
be passed than will fit in the global registers, additional 
parameters may be passed in the stack of the calling 
procedure, or in a data structure which is referenced by 
a pointer passed in the global registers. 

3.8.4 LOCAL REGISTER CACHE 

The 80960CA provides an on-chip cache for saving and 
restoring the local registers on calls and returns. This 
cache greatly enhances performance of the call and re- 
turn mechanism on the 80960CA. Movement of data 
between the local registers and the register cache is typ- 
ically accomplished in only 4 processor clocks with no 
external bus traffic. When this cache is filled, the regis- 
ters associated with the oldest stack frame are moved to 
the area reserved for those registers on the physical 
stack (Figure 3-7). 
Figure 3-7. Stack Frame Linkage

The local register cache is a physical extension of the
internal data RAM. The part of the data RAM used for 
this cache is not visible to the user and is large enough 
to hold up to 5 sets of local registers. The register cache 
may be extended to hold up to 15 sets of local registers. 
When extended, each new register set consumes 16 
words of the user's data RAM, beginning at the highest 
address and growing downward. The size of the local 
register cache is selected when the processor is initial- 
ized. 

In some cases, the contents of the cached local register 
sets may require examination or modification (e.g. for 
fault handling). Since the local registers are cached, the 
flushreg instruction is provided to flush the local regis- 
ter cache to the locations reserved for the registers on 
the stack. This ensures that the values in external
memory are consistent with the values held in the local
register cache.
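
The spill and flush behavior of the register cache can be pictured with the following Python sketch. It is a simplified model under stated assumptions: the class and method names are hypothetical, register sets are identified only by their frame pointer, and refilling the cache on returns is not modeled.

```python
from collections import deque

class LocalRegisterCache:
    """Toy model: which register sets are on-chip and which are on the stack."""

    def __init__(self, sets=5):
        if not 1 <= sets <= 15:
            raise ValueError("cache holds between 1 and 15 register sets")
        self.sets = sets
        self.cached = deque()  # frame pointers of cached sets, oldest first
        self.spilled = []      # frame pointers of sets written to the stack

    def save(self, fp):
        """Save the register set of frame fp, as happens on a call."""
        if len(self.cached) == self.sets:
            # Cache full: the oldest set moves to its reserved stack area.
            self.spilled.append(self.cached.popleft())
        self.cached.append(fp)

    def flushreg(self):
        """Write every cached set to its reserved stack area."""
        self.spilled.extend(self.cached)
        self.cached.clear()
```

With the default of 5 sets, the sixth nested save forces the oldest set out to the stack; flushreg then writes the remaining five.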



3.8.5 LOCAL AND SYSTEM CALLS 

The 80960CA provides two methods for making proce- 
dure calls: local calls and system calls. Local and sys- 
tem calls differ in their operation and use in an applica- 
tion. 



The local call instructions initiate a procedure call us- 
ing the call and return mechanism described earlier. 
The stack frames for these procedure calls are allocated 
on the local procedure stack. A local call is made using 
either of two local call instructions: call or callx. The 
call instruction specifies the address of the called
procedure using an IP plus displacement addressing mode
with a range of -2^23 to 2^23 - 4 bytes from the current
IP. The callx (call extended) instruction specifies the
address of the called procedure using any of the
80960's addressing modes.
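
A quick way to decide between call and callx for a given target is to test the displacement against the stated range. The Python sketch below is illustrative; the function name is hypothetical, and the word-alignment check assumes that instructions start on 4-byte boundaries.

```python
CALL_DISP_MIN = -(2 ** 23)     # lowest displacement reachable by call
CALL_DISP_MAX = 2 ** 23 - 4    # highest displacement reachable by call

def call_target_reachable(ip, target):
    """True if target can be reached from ip with the call instruction."""
    disp = target - ip
    # Assumed: displacements are multiples of 4 (word-aligned instructions).
    return disp % 4 == 0 and CALL_DISP_MIN <= disp <= CALL_DISP_MAX
```

Targets outside this range would need callx, which accepts any of the addressing modes.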

A system call is made using the calls instruction. This 
call is similar to a local call except that the processor 
gets the IP for the called procedure from a data struc- 
ture called the system procedure table. The calls in- 
struction requires a procedure number operand. This 
procedure number serves as an index into the system 
procedure table, which contains IPs for specific
procedures. The system procedure table is shown in Figure
3-8. 

The system call mechanism supports two types of pro- 
cedure calls: system-local calls and system-supervisor 
calls (also referred to as supervisor calls). The system-
local call performs the same action as the local call
instructions with one exception: the IP target for a sys- 
tem-local call is fetched from the system-procedure ta- 
ble. The supervisor call differs from the local call as 
follows: 

1) A supervisor call causes the processor to switch to 
another stack (called the supervisor stack). 

2) A supervisor call causes the processor to switch to 
the supervisor execution mode and asserts the
80960CA's supervisor (SUP) pin for all bus accesses. 

The system call mechanism offers several benefits. The 
system call promotes the portability of application soft- 
ware. System calls are commonly used for kernel serv- 
ices. By calling these services with a procedure number 
rather than a specific IP, application software does not 
have to be changed each time the implementation of the 
kernel service is modified. Additionally, the ability to 
switch to a different execution mode and stack allows 
kernel procedures and data to be insulated from appli- 
cation code. 
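
The dispatch performed by calls can be sketched as a table lookup. The Python model below is an illustration under assumptions: the Entry and Cpu types and their fields are hypothetical, and the actual entry encoding of the system procedure table (Figure 3-8) is not reproduced.

```python
from dataclasses import dataclass

@dataclass
class Entry:
    ip: int            # entry point of the system procedure
    supervisor: bool   # True for a system-supervisor call entry

@dataclass
class Cpu:
    supervisor_mode: bool = False
    stack: str = "local"

def calls(cpu, table, number):
    """Look up procedure `number` and return the IP to branch to."""
    entry = table[number]
    if entry.supervisor:
        # Supervisor call: switch execution mode and stack.
        cpu.supervisor_mode = True
        cpu.stack = "supervisor"
    # System-local call: same as a local call, only the IP comes from the table.
    return entry.ip
```

Because the application names only the procedure number, the table entry can be repointed at a new kernel implementation without changing the calling code.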

3.8.6 IMPLICIT PROCEDURE CALLS 

The call and return mechanism described for procedure 
calls applies to several classes of call instructions as 
well as to the context switching initiated by interrupts 
and faults. When an interrupt or fault condition occurs, 
an implicit call is performed that saves the current state 
of the processor before branching to the interrupt or 
fault handling procedure. When this context switch oc- 
curs, the local registers are saved and a new stack frame 
is allocated. Additionally, the values of the AC register 
and PC register are saved when the implicit call occurs. 
These values are restored on the return from the inter- 
rupt or fault handler. 



3.9 Interrupts 

An interrupt is a temporary break in the control stream 
of a program so that the processor can handle another 
task. Interrupts may be triggered by the instruction 
stream or by hardware sources internal and external to 
the 80960CA. An interrupt request is associated with a 
vector (i.e. an address) of an interrupt handling proce- 
dure. The processor will branch to the handling proce- 
dure when an interrupt is serviced. When the handling 
action is completed, the processor is restored to its state 
prior to the interrupt.

3.9.1 INTERRUPT VECTORS AND PRIORITY 

Interrupt vectors are simply instruction pointers
(addresses) to interrupt handling procedures. The 80960
architecture defines 248 interrupt vectors. This means
that 248 unique interrupt handling procedures may be
used. An 8-bit interrupt vector number is associated
with each interrupt vector. This number ranges from 8
to 255. Each interrupt vector has a priority from 1 to
31, which is determined by the 5 most significant bits of
the interrupt vector number. Priority 1 is the lowest
priority and 31 is the highest. Priority 0 interrupts are
not defined.
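
Since the priority is the upper 5 bits of the 8-bit vector number, it can be obtained with a 3-bit right shift. A minimal Python sketch (the function name is hypothetical):

```python
def vector_priority(vector):
    """Return the priority (1 to 31) of an interrupt vector number (8 to 255)."""
    if not 8 <= vector <= 255:
        raise ValueError("interrupt vector numbers range from 8 to 255")
    # The 5 most significant bits of the 8-bit vector number.
    return vector >> 3
```

Each priority level therefore covers eight consecutive vector numbers, for example vectors 80 to 87 at priority 10.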

The 80960CA executes with a unique priority ranging
from 0 to 31. When an interrupt is serviced, the
processor's priority switches to the priority corresponding to
that of the interrupt request. When a return from an 
interrupt procedure is executed, the process priority is 
restored to its value prior to servicing the interrupt. 
This priority switching is handled automatically by the 
80960CA. 

The 80960CA compares its current priority and the pri- 
ority of an interrupt request to determine whether to 
service an interrupt immediately or to delay service. If a 
requested interrupt priority is greater than the proces- 
sor's current priority or equal to 31, the processor serv- 
ices the interrupt immediately; otherwise, the processor 
saves (posts) the interrupt request as a pending
interrupt so that it can be serviced later. When the
processor's priority falls below the priority of a pending
interrupt, the pending interrupt is serviced. With the
mechanism described, interrupts with a priority of 0 will
never be serviced. For this reason, vectors numbered 0 to 7
are not defined.
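
The service-or-post decision reduces to a single comparison. The Python sketch below restates the rule given above; the function name is hypothetical.

```python
def should_service_now(request_priority, current_priority):
    """True if the interrupt is serviced immediately, False if it is posted."""
    return request_priority == 31 or request_priority > current_priority
```

A priority 0 request can never satisfy either condition, since no processor priority is lower than 0, which is why such interrupts are never serviced.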



3.9.2 INTERRUPT TABLE 

The interrupt table (Figure 3-9) is an architecturally 
defined data structure which holds the interrupt vectors 
and information on pending interrupts. The first 36
bytes of the table are used to post interrupts. Each of
the 31 most significant bits in the 32-bit pending priorities
field represents a possible priority (1 to 31) of a pending
interrupt. When the processor posts an interrupt in the
interrupt table, the bit corresponding to the interrupt's
priority is set. For example, if an interrupt with a
priority of 10 is posted in the interrupt table, bit 10 is set in
the pending priorities field.

The pending interrupts field contains a 256-bit string in 
which each bit represents an interrupt vector. When the 
processor posts an interrupt in the interrupt table, the 
bit corresponding to the vector number of that inter- 
rupt is set. 
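
Posting an interrupt therefore sets one bit in each of the two fields. The Python sketch below models the fields as integers; the class and attribute names are hypothetical, and the in-memory layout of the table (Figure 3-9) is not reproduced.

```python
class InterruptTable:
    """Toy model of the posting fields at the start of the interrupt table."""

    def __init__(self):
        self.pending_priorities = 0   # 32-bit field (4 bytes)
        self.pending_interrupts = 0   # 256-bit string (32 bytes)

    def post(self, vector):
        priority = vector >> 3   # 5 most significant bits of the vector number
        self.pending_priorities |= 1 << priority
        self.pending_interrupts |= 1 << vector
```

The two fields together occupy 4 + 32 = 36 bytes, matching the size of the posting area stated above. Posting vector 83 (priority 10) sets bit 10 of the pending priorities field and bit 83 of the pending interrupts field.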

Portions of the interrupt table are cached on-chip in a 
non-transparent fashion. This caching is implemented 
to minimize interrupt latency by reducing the number
of accesses to the table in external memory when an 
interrupt is serviced. 
Figure 3-9. Interrupt Table

3.9.3 INTERRUPT STACK 

Stack frames for interrupt handling procedures are allo- 
cated on a separate interrupt stack. The interrupt stack 
can be located anywhere in the processor's address 
space. The beginning address of the interrupt stack is 
specified when the processor is initialized. 

3.9.4 INTERRUPT HANDLING ACTION 

When an interrupt is serviced, the processor saves the 
processor state and calls the interrupt procedure. The 
processor state is restored upon return from the inter- 
rupt procedure. 

This interrupt service mechanism is handled by an im- 
plicit call operation. When the interrupt is serviced, the 
current local registers are saved. A new local register 
set and stack frame are allocated on the interrupt stack 
for the interrupt handler procedure and the processor 
switches to supervisor execution mode. In addition to 
the local registers, the current values of the AC and PC
registers are saved as an interrupt record on the inter- 
rupt stack. 

3.9.5 PENDING INTERRUPTS 

Any of the 248 interrupts can be requested by software. 
The system control instruction (sysctl) is provided to 
support this feature. When the system control
instruction requests an interrupt, one of two actions may
occur depending on the priority of the requested interrupt
and the current process priority: 1) the interrupt is
serviced immediately, or 2) the interrupt is posted (the
pending priorities field and the pending interrupts field
are modified to reflect a pending interrupt).

Interrupts may also be requested by hardware sources 
internal and external to the 80960CA. Managing the 
hardware sources and posting these interrupts is han- 
dled by the interrupt controller. Interrupts requested by 
hardware are posted in an internal register, not in the 
interrupt table. A mask register enables or disables in- 
terrupts from each hardware source. Requesting and 
posting hardware interrupts is described in Section 4.4 
Interrupt Controller. 

3.9.6 INTERRUPT LATENCY 

The time required to perform an interrupt task switch 
is referred to as the interrupt latency. The latency is the 
time measured between the activation of an interrupt 
source and the execution of the first instruction for the 
interrupt-handling procedure for the source. 

Interrupt latency for the 80960CA varies depending on 
conditions such as: 

— Complex instructions are executing when the inter- 
rupt occurs (e.g. sysctl, call, ret, etc.). 

— Outstanding loads to a local register are pending, 
delaying the interrupt context switch. 

— Division, multiplication, or other multi-cycle in-
structions with a local register as destination are
executing. 

The 80960CA has been designed to optimize latency 
and throughput for interrupts. Two processor features 
are designed for this purpose: 

First, in the interrupt table, all interrupt vectors with 
an index whose least significant four bits are 0010 (binary) can
be cached in internal data RAM. The processor will 
automatically read these vectors from data RAM when 
the interrupt is serviced. This feature reduces the added 
latency due to an external access of the interrupt table 
for that vector. The NMI vector is always cached in 
data RAM. 
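
To see which vectors qualify for caching under this rule, the low four bits of the vector number can be tested directly. A minimal Python sketch (the function name is hypothetical):

```python
def vector_cacheable(vector):
    """True if the vector's least significant four bits are 0010 (binary)."""
    return (vector & 0xF) == 0b0010

# Defined vectors (8 to 255) that satisfy the rule.
CACHEABLE_VECTORS = [v for v in range(8, 256) if vector_cacheable(v)]
```

This selects one vector in every group of sixteen (18, 34, 50, and so on up to 242), so vectors intended for the lowest latency would be assigned from this set.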

Second, an instruction cache locking mechanism allows 
interrupt procedures or segments of interrupt proce- 
dures to be stored in the instruction cache. These rou- 
tines are always executed from the internal cache, elim- 
inating external code fetches and reducing latency and 
increasing throughput for the interrupt. 
3.10 Fault Handling and Instruction
Tracing

The 80960CA is able to detect various conditions in 
code or in its internal state that could cause the proces- 
sor to deliver incorrect or inappropriate results or that 
could cause it to head down an undesirable control 
path. These conditions are referred to as faults. The 
80960 architecture provides fault handling mechanisms 
to detect and, in most cases, fully recover from a fault. 

The 80960CA provides on-chip debug support by trig- 
gering trace events and servicing the trace fault. A trace 
event is activated when a particular instruction or type 
of instruction is encountered in an instruction stream. 
The trace event optionally signals a fault. A fault han- 
dling procedure for the trace fault can act as a debug 
monitor and analyze the state of the processor when the 
trace event occurred. 



3.10.1 FAULT TYPES AND SUBTYPES 

All of the faults that the processor detects are pre- 
defined. These faults are divided into types and sub- 
types, each of which is given a number. Table 3-4 lists 
the faults that the processor detects arranged by type 
and subtype. 



Table 3-4. Fault Types and Subtypes

Fault Type    Fault Subtype       Fault Record
Parallel                          xxoo ooxx
Trace         Instruction Type
              Branch Trace
              Call Trace
              Return Trace
              Prereturn Trace
Figure 3-11. Fault Record



3.10.2 FAULT TABLE 

The fault table (Figure 3-10) provides the processor 
with a pathway to fault handling procedures. The fault 
table is an architecture-defined data structure, which 
may be located anywhere in the processor's address 
space. The location of the fault table is specified at ini- 
tialization. When a fault occurs, an entry in the table is 
selected based on the type of fault that occurs. The 
entry in the fault table contains a pointer to a specific 
fault handler. 

The fault table can contain two types of entries (Figure 
3-10). The first type of entry is simply a pointer to the 
address of the fault-handling procedure. The second 
type of entry is an index into the system-procedure ta- 
ble. Fault-handling procedures accessed through the 
system-procedure table may be executed in user or su- 
pervisor execution mode. 



3.10.3 FAULT HANDLING ACTION 

When a fault occurs, the processor performs an implicit 
call operation to the procedure specified in the fault 
table. In addition to performing the implicit call opera- 
tion, the processor creates a fault record in its newly 
allocated stack frame. This fault record contains infor- 
mation on the state of the processor when the fault 
occurred and the fault type and subtype (Figure 3-11). 

Some faults can be recovered from easily. When recov- 
ery from a fault is possible, the processor's fault han- 
dling mechanism allows the processor to automatically 
resume work where the fault was signalled. The re- 
sumption action is initiated with the ret instruction. If 
simple recovery from a fault is not possible, then the 
fault handling procedure may call a debug monitor, ini- 
tiate a reset, or take other actions to recover from the 
fault. 



3.10.4 TRACING AND DEBUG 

The 80960CA provides a facility for monitoring the ac- 
tivity of the processor by tracing the instruction stream. 
A trace event occurs at points in a program where
certain types of instructions are encountered or a certain
IP or data address is encountered. When a trace event
occurs, a trace fault can be generated and a trace-fault
handler called which displays or analyzes the state of
the processor.

3.10.4.1 Trace Events 

The Trace Control (TC) Register (Figure 3-12) is used 
to specify the types of instructions which cause trace 
events. When a mode bit in the TC register is set, spe- 
cific instructions will generate trace events. For exam- 
ple, if the branch trace mode bit is enabled and a 
branch instruction is executed, a branch trace event will 
be signalled. An event flag is used to record trace 
events. A single event flag is provided for each mode 
bit. Any trace event generates a trace fault when the 
trace enable bit in the process control register is set. 

The 80960CA recognizes 7 trace events. These events 
are described below. 

Instruction Trace Event — Signalled each time an in- 
struction is executed. This trace event can be used with 
a debug monitor to single step the processor. 

Branch Trace Event — Signalled each time a branch in- 
struction is executed. For conditional branch instruc- 
tions, this event is only signalled when the branch is 
taken. Branch-and-link, call, and return instructions do 
not signal this trace event. 

Call Trace Event — Signalled each time a branch-and- 
link or call instruction is executed. Implicit calls, such 
as those used in interrupt or fault handling, signal this 
event. When a call trace event occurs, the prereturn
trace flag (bit 3 in local register r0) is set by the proces-
sor to indicate that a prereturn trace is pending.

Pre-Return Trace Event — Signalled just prior to any ret 
instruction. This event is only signalled if the pre-return 
trace flag in register r0 is set. Since the pre-return trace
flag is set when a call trace event occurs, the call trace 
mode must be enabled before a pre-return trace event 
can be signalled. 




Return Trace Event - Signalled each time a ret instruc-
tion is executed.

Trace mode bits: Instruction, Branch, Call, Return, Prereturn,
Supervisor, Breakpoint.
Trace event flags: Instruction, Branch, Call, Return, Prereturn,
Supervisor, Breakpoint, Data Address Breakpoint 0 and 1,
Instruction Address Breakpoint 0 and 1.
Reserved bits are initialized to 0.

Figure 3-12. Trace Control Register



Supervisor Trace Event — Signalled each time a calls 
instruction is executed where the selected entry type is 
supervisor, or when a ret from supervisor mode is exe- 
cuted. 

Breakpoint Trace Events — Signalled each time a mark 
instruction, fmark instruction, or specified address is 
encountered in the instruction stream. The mark in-
struction signals an event only when the breakpoint trace
mode is enabled; the fmark (force mark) instruction
will generate a breakpoint trace event regardless of the
value of the breakpoint trace mode bit.

Two IP breakpoint registers and two internal data ad- 
dress breakpoint registers are provided on the 
80960CA. These breakpoints are loaded with an in- 
struction or data address using the system control 
(sysctl) instruction. When the address is encountered 
and the breakpoint trace mode bit is set, a breakpoint 
trace event occurs. A corresponding instruction or data 
address event flag is set in the TC register when the 
address is encountered. 
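
The rules above for when a trace event is signalled and when it becomes a trace fault can be sketched as a small Python model. This is illustrative only: the names `TraceMode`, `trace_event`, and `trace_fault` are assumptions made for the example, and the bit values chosen by `auto()` are placeholders, not the bit positions shown in Figure 3-12.

```python
from enum import Flag, auto


class TraceMode(Flag):
    """Trace mode bits of the TC register (bit positions are placeholders)."""
    INSTRUCTION = auto()
    BRANCH = auto()
    CALL = auto()
    RETURN = auto()
    PRERETURN = auto()
    SUPERVISOR = auto()
    BREAKPOINT = auto()


def trace_event(modes, kind, forced=False):
    """Return True if an event of type `kind` is signalled.

    `forced` models fmark, which signals a breakpoint trace event
    regardless of the breakpoint trace mode bit.
    """
    if forced and kind == TraceMode.BREAKPOINT:
        return True
    return bool(modes & kind)


def trace_fault(modes, kind, trace_enable, forced=False):
    """A trace event becomes a trace fault only if the trace enable bit is set."""
    return trace_enable and trace_event(modes, kind, forced)
```

For example, `trace_fault(TraceMode.BRANCH, TraceMode.BRANCH, trace_enable=False)` is false: the event is signalled but no fault is generated.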



3.10.5 PROCESSOR INITIALIZATION 

The Initial Memory Image (IMI) comprises the data struc-
tures needed to initialize the 80960CA (Figure 3-13).
The initialization boot record, in reserved memory be- 
ginning at FFFFFF00H, contains a pointer to the Proc- 
essor Control Block (PRCB). The PRCB in turn holds 
pointers to the data structures which are necessary to 
execute code on the 80960CA. The PRCB also holds 
several fields which contain information to initially 
configure the 80960CA. 



Processor initialization begins by asserting the RESET 
pin. At initialization the processor optionally performs 
an internal self-test. A bus confidence test is also per- 
formed by calculating a checksum of 8 words read from 
external memory. If either of these self-tests fails, the
FAIL pin indicates the failure and the processor aborts 
initialization. If the self-test passes, the 80960CA con- 
tinues with initialization and branches to the first ad- 
dress of the user's code. 

Fixed Data Structures

Initialization Boot Record:

Address                  Contents
FFFFFF00H                Bus Configuration (Least Significant Byte)
FFFFFF10H                First Instruction Pointer
FFFFFF14H                PRCB Pointer
FFFFFF18H to FFFFFF2CH   6 Check Words (for bus confidence self-test)

Relocatable Data Structures

User Code (addressed by the First Instruction Pointer)

Process Control Block (PRCB):

Byte Offset   Contents
0H            Fault Table Base Address
4H            Control Table Base Address
8H            AC Register Initial Image
CH            Fault Configuration Word
10H           Interrupt Table Base Address
14H           System Procedure Table Base Address
18H           Reserved
1CH           Interrupt Stack Pointer
20H           Instruction Cache Configuration Word
24H           Register Cache Configuration Word

The PRCB base addresses locate the Control Table, Interrupt Table, and
System Procedure Table. Other architecturally defined data structures
are not required as part of the IMI.

Figure 3-13. Initial Memory Image
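
As a rough illustration of the PRCB layout, the following Python sketch packs the ten words listed at byte offsets 0H to 24H in Figure 3-13 into a 40-byte image. The field names, the helper functions `pack_prcb` and `field_offset`, and the little-endian word order are assumptions made for the example; consult the figure for the authoritative layout.

```python
import struct

# Field order follows the byte offsets 0H-24H listed in Figure 3-13.
PRCB_FIELDS = (
    "fault_table_base",
    "control_table_base",
    "ac_register_initial_image",
    "fault_configuration_word",
    "interrupt_table_base",
    "system_procedure_table_base",
    "reserved",
    "interrupt_stack_pointer",
    "instruction_cache_configuration_word",
    "register_cache_configuration_word",
)

PRCB_FORMAT = "<10I"  # ten 32-bit words; little-endian order is an assumption


def pack_prcb(**fields):
    """Pack named PRCB fields into a 40-byte image; missing fields become 0."""
    unknown = set(fields) - set(PRCB_FIELDS)
    if unknown:
        raise KeyError(f"unknown PRCB field(s): {sorted(unknown)}")
    return struct.pack(PRCB_FORMAT, *(fields.get(name, 0) for name in PRCB_FIELDS))


def field_offset(name):
    """Byte offset of a field within the PRCB."""
    return 4 * PRCB_FIELDS.index(name)
```

A call such as `pack_prcb(fault_table_base=0x1000)` yields an image whose first word holds the fault table address and whose remaining words are zero.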

4.0 80960CA SYSTEM
IMPLEMENTATION 

This section is an overview of the peripherals integrated 
with the 80960CA core. The features and operation of 
the Bus Controller, DMA Controller, Interrupt Con- 
troller, and the interfaces between these peripherals and 
the core are described. 



Bus Request — A bus request is issued by the core and 
directed to the Bus Controller. A bus request is sent to 
the BCU when a load, store, or an atomic instruction is 
executed, or when an instruction fetch is needed. Bus 
requests are also issued by the core to perform DMA 
transfers. A bus request can consist of one or more bus 
accesses. For example, an aligned word (32-bit) request 
to an 8-bit memory region will result in four byte- 
length accesses. 



4.1 Peripheral Interface 

A program communicates with the on-chip peripherals 
by reading or modifying the special function registers 
(SFRs) or by loading control registers. The SFRs gen- 
erally serve to transfer status information and data be- 
tween a peripheral and the core, and the control regis- 
ters serve to configure the peripherals. SFRs are ac- 
cessed directly as instruction operands. The control 
registers are loaded by using the system control (sysctl) 
instruction. 



4.2 Bus Controller Unit 

The Bus Controller Unit (BCU) manages the data and 
instruction path between the 80960CA and external 
memory. Data operations and instruction fetches share 
a 32-bit data bus. Memory addresses are output on a 
separate 32-bit address bus. The BCU incorporates sev- 
eral advanced features to simplify the bus interface to 
external memory. A programmable memory region con- 
figuration table allows the characteristics of the exter- 
nal bus to be programmed differently for 16 separate 
regions in memory. The attributes of the external bus 
which are programmable include wait states and exter- 
nal ready control, data bus width (8, 16, or 32 bits), 
burst mode, address pipelining, and byte ordering. The 
region programmable bus options are described in this 
section. 



4.2.1 BUS TRANSFERS, ACCESSES, AND 
REQUESTS 

The distinction between transfer, bus access, and bus 
request, as these terms apply to the 80960CA, must be 
presented before beginning a discussion of the BCU. 

Transfer — A bus transfer is defined simply as a move- 
ment of code or data between a memory system and the 
80960CA. A write transfer occurs when the memory 
system is the destination of a data movement. A read 
transfer occurs when the 80960CA is the destination 
for a data or a code fetch from memory. 

Bus Access — A bus access is defined as an address cycle 
and one or more transfers. In burst mode, an access can 
consist of a single address cycle and 1 to 4 transfers. 
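
The relationship between requests, accesses, and transfers can be sketched in Python as follows. This is a simplified model for aligned requests only; the function name `decompose_request` and its parameters are assumptions made for the example, and it ignores any restrictions the BCU places on where a burst may start.

```python
from math import ceil

MAX_BURST_TRANSFERS = 4  # a burst access carries 1 to 4 transfers


def decompose_request(request_bytes, bus_width_bits, burst=False):
    """Return (accesses, transfers) for one aligned bus request."""
    if request_bytes <= 0:
        raise ValueError("request must move at least one byte")
    if bus_width_bits not in (8, 16, 32):
        raise ValueError("bus width must be 8, 16, or 32 bits")
    # A transfer moves at most one bus width of data.
    bytes_per_transfer = min(request_bytes, bus_width_bits // 8)
    transfers = ceil(request_bytes / bytes_per_transfer)
    if burst:
        accesses = ceil(transfers / MAX_BURST_TRANSFERS)
    else:
        accesses = transfers  # one address cycle per transfer
    return accesses, transfers
```

With this model, a word request to an 8-bit region without burst gives `(4, 4)`: four byte-length accesses of one transfer each.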



4.2.2 BUS CONTROL COPROCESSOR 

The 80960CA's peripherals are often referred to as co- 
processors, since their operation is decoupled from the 
execution of the instruction stream. As an integrated 
coprocessor, the BCU receives bus requests and inde- 
pendently carries out the action of moving data or code 
between the processor and external memory. The BCU 
uses a three deep queue to store pending bus requests. 
The queue decouples the core from the BCU, since a 
series of adjacent requests may be issued faster than the 
BCU can service each request. Two of the three queue 
entries store requests from a user's program (loads, 
stores, fetches, etc.). The third queue entry is used by 
requests originating from a DMA operation. This 
queue entry takes user requests when the DMA is 
turned off. The 80960CA alternates service of requests 
issued by the user program and requests issued by a 
DMA operation. 

4.2.3 SIGNAL DESCRIPTIONS 

The external bus signals consist of 30 address signals, 4 
byte enables, 32 data lines, and various control signals. 

D31-D0 32-bit Data Bus (bi-directional)— 32-, 16-, 
and 8-bit values are transmitted and re- 
ceived on these lines. The 8- and 16-bit 
quantities are transferred on the low order 
data lines when a memory region is config- 
ured respectively for an 8- or 16-bit bus. 

A31-A2 30-bit Address (outputs)— The 30-bit ad- 
dress bus identifies all external addresses to 
word (4-byte) boundaries. The byte enable 
lines indicate the selected byte in each 
word. 

BE3-BE0 Byte Enables (outputs)— The byte enables 
select which of 4 addressed bytes are active 
in a memory access. When a memory re-
gion is configured for an 8-bit bus width,
BE1 and BE0 act as the lower two bits of
the address. For a 16-bit memory region,
BE1, BE3, and BE0 are encoded to provide
A1, BHE, and BLE respectively.

W/R Write or Read (output) — This signal is low
for read accesses and high for write access-
es.

ADS Address Strobe (output) — Indicates valid
address and the start of a new bus access.

DT/R Data Transmit or Receive (output) —
Direction control for data transceivers;
similar to W/R.

DEN Data Enable (output) — Low during a
bus request after the first address cy-
cle. This signal is used to control data
transceivers and to indicate the end of
a bus request.

WAIT Wait (output)— Indicates that wait
states are being inserted by the internal
wait state generator.

READY Ready (input)— Signals that data is
valid for a read transfer or ends data
hold for a write transfer. This function
can be disabled for a memory region.

BTERM Burst Terminate (input) — Terminates
a burst access. Another address is gen-
erated to complete the request when
the signal is deasserted. This function
can be disabled for a memory region.

D/C Data or Code (output) — Indicates a
data transfer or a code fetch.

DMA DMA Access (output) — Indicates whether
a bus request was initiated by the user
program or by the DMA.

SUP Supervisor Access (output) — Indicates
that a bus access originated from a bus
request issued in supervisor mode.
This signal can be used to protect sys-
tem data structures or peripherals
from errant modification by the user
code.

LOCK Lock (output) — Indicates that an
atomic memory operation is in prog-
ress. This signal can be used to inhibit
external agents from modifying memo-
ry which is atomically accessed.

BLAST Burst Last (output) — Indicates the last
transfer in a burst access.

HOLD Hold (input)— HOLD can be used by
a bus requester to request access to the
bus. The processor asserts HOLDA af-
ter the current bus request or locked
requests have completed.

HOLDA Hold Acknowledge (output) — Indi-
cates to a bus requester that the proc-
essor has relinquished control of the
bus.

BREQ Bus Request (output) — Indicates that
requests are queued in the bus control-
ler and are waiting to be serviced.
BREQ can be used for external bus ar-
bitration logic in conjunction with
HOLD and HOLDA to regain bus mas-
tership.

Figure 4-1 shows the timing for a simple, non-burst, 
non-pipelined read and write access. The timing rela-
tions for the key control signals are shown in this fig-
ure.

Figure 4-1. Basic Read and Write Request

4.2.4 MEMORY REGION CONFIGURATION
TABLE 

The BCU can be configured differently for 16 separate 
sections (referred to as regions) of the address space. 
The four most significant bits of a memory address de- 
fine the location of each region in memory. The bus 
characteristics in a region are specified in the memory 
region configuration table. When a bus request is serv- 
iced, the BCU accesses the configuration table entry for 
the region addressed and services the request based on 
the bus characteristics programmed for that region. 
The characteristics programmed for each region are 
listed below: 



- Burst Mode (on/off)
- Wait States (5 parameters)
- Bus Width (8-, 16-, or 32-bit)
- Ready Inputs (on/off)
- Address Pipelining (on/off)
- Byte Ordering (Big/Little Endian)



The flexibility of region programming simplifies the bus 
interface in applications where a memory system is 
made up of a variety of sub-systems, such as SRAM, 
DRAM, ROM, and memory mapped peripherals. Each 
memory sub-system can be mapped into a different re- 
gion in memory, and that region can be configured spe- 
cifically for the requirements of the particular memory 
sub-system. 



The configuration table is made up of 16 on-chip con- 
trol registers (Figure 4-2). Each register is programmed 
with the configuration information for a single region. 
Since the region table is located on-chip, access to re- 
gion information does not affect the performance of the 
bus. 
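
The region lookup described above can be sketched in Python. The four most significant address bits select one of 16 table entries. The `RegionConfig` class, its field names, and the helper functions are assumptions made for the example; the wait state parameters are omitted, and the real register format is the one shown in Figure 4-2.

```python
from dataclasses import dataclass


@dataclass
class RegionConfig:
    """Per-region bus characteristics (field names are illustrative)."""
    bus_width_bits: int = 32
    burst: bool = False
    pipelining: bool = False
    ready_enabled: bool = False
    big_endian: bool = False


def region_of(address):
    """Region number: the four most significant bits of a 32-bit address."""
    return (address & 0xFFFFFFFF) >> 28


# 16 entries, one per region, standing in for the on-chip control registers.
region_table = [RegionConfig() for _ in range(16)]


def config_for(address):
    """Return the configuration that applies to a bus request at `address`."""
    return region_table[region_of(address)]
```

For example, the initialization boot record at FFFFFF00H falls in region 15, so `config_for(0xFFFFFF00)` returns `region_table[15]`.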



4.2.4.1 Burst Accesses 

The 80960CA BCU is capable of burst accesses to 
memory systems which are designed to support this fea- 
ture. Burst mode is intended to get the most perform- 
ance from low cost memory systems. A burst access is a 
single address cycle followed by successive data or in- 
struction transfers. The transfers reference data or in- 
structions at sequential addresses starting at the address 
which began the burst access (Figure 4-3). In a burst 
memory system, the upper 28 bits of an address remain 
fixed while the lower two bits A2 and A3 increment to 
access subsequent locations. 
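
The address sequence of a burst can be sketched in Python for a 32-bit region. The function name `burst_word_addresses` is an assumption made for the example; the sketch simply enforces the statement above that the upper 28 address bits stay fixed while A3:A2 increment.

```python
def burst_word_addresses(start, transfers):
    """Word addresses touched by one burst access on a 32-bit bus."""
    if start % 4:
        raise ValueError("start address must be word aligned")
    if not 1 <= transfers <= 4:
        raise ValueError("a burst access has 1 to 4 transfers")
    base = start & ~0xF          # A31-A4 stay fixed for the whole burst
    first = (start >> 2) & 0x3   # initial value of A3:A2
    if first + transfers > 4:
        raise ValueError("burst would change the upper 28 address bits")
    return [base | ((first + i) << 2) for i in range(transfers)]
```

A four-transfer burst starting at 1000H therefore visits 1000H, 1004H, 1008H, and 100CH.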

Wait state timing for the first access of a burst request 
is controlled independently from the timing for subse- 
quent accesses. A memory sub-system using static col- 
umn mode or page mode DRAMs, for example, can 
take advantage of the short column access times for 
these devices by using burst mode. Interleaved ROM or 
EPROM systems can also be constructed which simul- 
taneously access several words and then use burst mode 
to multiplex the multi-word array onto the data bus. 

Figure 4-2. Memory Region Configuration Table

Figure 4-3. Burst Memory Request

Figure 4-4. Programmable Wait States

4.2.4.2 Programmable Wait State Generation

The 80960CA may be interfaced with a variety of mem- 
ory sub-systems and peripherals with a minimum sys- 
tem cost and complexity. To achieve this interface flexi- 
bility, the 80960CA implements an internal program- 
mable wait state generator. Internally generated wait 
states eliminate the potential system delays which come 
from generating wait states with external logic. 

Wait states are programmed for each region in the 
memory region configuration table. The number of wait 
states is programmable over a range which allows effi- 
cient control of memory devices ranging from ultra-fast 
SRAMs to slow peripherals. An external ready signal is 
also provided for external wait state control. 

The wait states which can be generated by the
80960CA are shown in Figure 4-4. In this table N is the
number of wait states inserted. The wait states for read
accesses and for write accesses are described by three
parameters each. For read accesses, NRAD is the num-
ber of states between the address cycle and the first
data cycle and NRDD is the number of states between
consecutive data cycles in a burst access. For writes,
NWAD is the number of states that data is held after an
address cycle, and NWDD is the number of states that
data is held for consecutive data cycles in a burst write.
For both reads and writes, NXDA is the number of
dead cycles after the last data cycle and before the next
address.
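
One way to read these parameters is as a clock count for a read access. The Python sketch below assumes that the address cycle and each data cycle take one clock, which this section does not state explicitly, so treat the totals as illustrative rather than as timing specifications. The function name `read_access_clocks` is hypothetical.

```python
def read_access_clocks(transfers, n_rad, n_rdd, n_xda):
    """Clocks for one read access, assuming one-clock address and data cycles."""
    if transfers < 1:
        raise ValueError("an access has at least one transfer")
    address = 1                                  # address cycle
    first_data = n_rad + 1                       # NRAD wait states, then data
    later_data = (transfers - 1) * (n_rdd + 1)   # NRDD before each later data cycle
    return address + first_data + later_data + n_xda
```

Under these assumptions, a four-transfer burst with NRAD = 2, NRDD = 1, and NXDA = 1 takes 1 + 3 + 6 + 1 = 11 clocks.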



4.2.4.3 READY Control 

The memory region configuration table allows the 
ready input (READY) to be enabled or disabled for 
each region. If the ready input is disabled, the external 
input has no effect on the wait states generated for a 
memory access; all wait states are generated internally. 
If the ready input is enabled, it works in conjunction 
with the programmable wait state generator. In this 



case, the ready input has no effect until the number of 
programmed wait states has expired. When the wait 
state counter reaches 0, the ready input is sampled, and 
wait states continue or are terminated based on the val- 
ue of the ready input. In order to gain complete exter- 
nal control over wait states, all wait state parameters 
for a region can be set to 0. 
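
The interaction between the programmed wait state count and the ready input can be sketched in Python. The function `wait_states_inserted` and its arguments are assumptions made for the example; `ready_samples` stands for the value seen on the ready input at each clock after the programmed count has expired.

```python
def wait_states_inserted(programmed, ready_samples, ready_enabled=True):
    """Count the wait states inserted for one data cycle.

    `ready_samples` is an iterable of booleans (True = ready), one per
    clock, consulted only after the programmed wait states have expired.
    """
    if not ready_enabled:
        return programmed            # all wait states generated internally
    extra = 0
    for ready in ready_samples:
        if ready:
            return programmed + extra
        extra += 1                   # not ready: one more wait state
    raise RuntimeError("ready input was never asserted")
```

Setting `programmed` to 0 gives the external device complete control: `wait_states_inserted(0, [False, False, True])` returns 2.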

4.2.4.4 Pipelined Reads 

The 80960CA BCU provides an address pipelining 
mode (Figure 4-5) to optimize the performance of in- 
struction and data fetches from external memory. 
When the pipelined read mode is enabled, an address 
cycle overlaps with the last data cycle in each access, 
effectively reducing the total time needed for each ac- 
cess. Pipelining mode is selected in each region by pro- 
gramming the memory region configuration table. 

4.2.4.5 Byte Ordering 

One of two configurations for byte ordering, often re- 
ferred to as little endian or big endian, is selected for 
each region by programming the memory region con- 
figuration table. The byte ordering options make the 
80960CA capable of sharing memory with a processor 
which uses either byte ordering scheme. Byte ordering 
refers to the way that the 80960CA relates internal data 
to the way that data is stored or fetched from memory. 
The little endian configuration orders the bytes in a 
short-word or word so that the least significant byte of
the quantity is positioned at the lowest address and the 
most significant byte at the highest address in memory. 
Conversely, for the big endian configuration, the least 
significant byte is positioned at the highest address, and 
the most significant byte at the lowest address. For ex-
ample, for little endian ordering, byte 0 of word data
would be found in memory at an address of the form
XXXX XXX0H and, for big endian, at address XXXX
XXX3H.
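
The two byte orderings can be demonstrated with Python's built-in integer conversion. The helper `store_word` is a hypothetical name used only for this illustration; it returns the four bytes of a word in increasing address order.

```python
def store_word(value, big_endian):
    """Return the 4 bytes of a word as they would appear at addresses +0..+3."""
    order = "big" if big_endian else "little"
    return (value & 0xFFFFFFFF).to_bytes(4, order)


def lsb_offset(big_endian):
    """Address offset of the least significant byte within a word."""
    return 3 if big_endian else 0
```

For the word 11223344H, the least significant byte 44H lands at offset 0 in a little endian region and at offset 3 in a big endian region.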

Figure 4-5. Pipelined Read Request

4.2.4.6 Data Alignment

The 80960CA can service any aligned or non-aligned 
bus request. Aligned requests are directed to their natu- 
ral boundary in memory. In other words, the addresses 
for aligned requests are even multiples of the length of 
the data transferred. Non-aligned requests are not serv-
iced directly by the BCU but are assisted by microcode.
Microcode automatically breaks non-aligned requests 
into multiple aligned requests which are then reissued 
to the BCU. Depending on the degree of non-alignment 
and the length of the original request, the resulting re- 
quests by microcode will consist of a combination of 
byte, short-word, and double-word requests. The BCU 
is able to generate an operation-unaligned fault when a 
non-aligned bus request is first received. This fault can 
be selectively masked at initialization. 
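
The idea of breaking a non-aligned request into aligned pieces can be sketched in Python. The actual microcode algorithm is not given in this section; the greedy strategy below, the function name `split_unaligned`, and the `max_size` parameter are assumptions chosen only to illustrate the principle.

```python
def split_unaligned(address, length, max_size=4):
    """Split a request into naturally aligned power-of-two pieces.

    Returns (address, size) pairs. At each step the largest size is chosen
    that is aligned at the current address and fits in the bytes remaining.
    """
    pieces = []
    while length > 0:
        size = max_size
        while size > 1 and (address % size or size > length):
            size //= 2
        pieces.append((address, size))
        address += size
        length -= size
    return pieces
```

Under this strategy a 4-byte request at 1001H becomes a byte request at 1001H, a short-word request at 1002H, and a byte request at 1004H.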



4.3 DMA Controller 

The DMA controller is a high-performance, full-func- 
tioned integrated peripheral. The DMA controller can 
manage 4 channels of DMA transfer concurrent with 
program execution. Separate external control for each 
channel is provided. Each channel supports high-per- 
formance memory to memory transfers where the 
source and destination can be any combination of inter- 
nal data RAM or external memory. The DMA Con- 
troller supports various types of transfers such as high- 
speed fly-by transfers and data chaining with the use of 
linked descriptor lists in memory. 

The 80960CA's DMA controller is implemented using 
dedicated hardware and microcode. Because of the effi- 
ciency of the core, it is possible for the microcode to 
execute DMA transfers at high speeds. DMA transfers 
are performed by the core concurrently with execution 
of the user's program. Internal DMA logic is used for 
sampling requests, synchronizing transfers with exter- 
nal devices, and handling the service of multiple active 
channels. 



4.3.1 SIGNAL DESCRIPTIONS 

Twelve pins are dedicated to the DMA controller.
Three pins are associated with each DMA channel.
These pins are described below. In this description, the
pin number corresponds to the channel number. For
example, the DREQ0 pin is the request pin for
channel 0.

DREQ3-DREQ0 DMA Request (input) — This input in-
dicates that an external device is re-
questing a DMA transfer. A DMA
transfer refers to the complete transfer
of one byte, short-word, word, or quad-
word, depending on the transfer data
width selected for the channel.

DACK3-DACK0 DMA Acknowledge (output) — This
output becomes active when the re-
questing device is accessed.

EOP3/TC3-EOP0/TC0 End of Process (input) or Terminal
Count (output)— This pin functions ei-
ther as an input (EOPx) or as an output
(TCx). When programmed as an out-
put, the pin is driven active for one
clock after byte count reaches zero and
a DMA terminates. When programmed
as an input, an external device can
cause the DMA operation to terminate.

4.3.2 DMA TRANSFERS 

The 80960CA DMA controller supports a variety of 
transfer modes and variations of these modes, allowing 
the DMA to adapt to a number of hardware systems 
and the performance requirements of these systems. 

4.3.2.1 Standard Block and Demand Mode 
Transfers 

A standard DMA transfer is made up of multiple bus 
requests. Loads from a source address are followed by 
stores to a destination address. The DMA controller 
issues the proper combination of these bus requests to 
execute the DMA transfer. For example, a typical 
DMA transfer between memory and an 8-bit peripheral 
could appear as a single byte load request directed to 
the source memory, followed by a single byte store re- 
quest directed to the 8-bit peripheral. 

The DMA controller has two basic transfer modes: 
block mode (unsynchronized) and demand mode (syn- 
chronized). Any DMA transfer will be serviced by one 
of these basic transfer modes. 

A block mode DMA is initiated by software. Block 
mode DMAs are generally between memory locations. Block
mode DMA transfers are not synchronized with any 
type of request from an external device. Once the DMA 
begins, it will continue until the entire block is com- 
plete or until it is suspended. The source and destina- 
tion addresses for block mode transfers can be incre- 
mented or held constant for a DMA. 

A demand mode DMA is controlled by an external 
device. Demand mode DMAs are generally between an 
external device and memory. In demand mode, each 
individual DMA transfer can be synchronized with a 
request. The request is signalled when an external de-
vice activates a DMA channel request pin (DREQ3-
DREQ0). The DMA controller acknowledges this re-
quest with the DMA acknowledge pin (DACK3-
DACK0) when the requesting device is accessed. A de-
mand mode transfer may be synchronized with either
the source or the destination device. 

4.3.2.2 Fly-by Transfers

A fly-by transfer mode is provided for the most per- 
formance-critical DMA applications. Fly-by mode also 
makes very efficient use of the external bus during a 
DMA. Standard DMA transfers involve multiple bus 
requests: load requests directed to the source and a 
store request directed to the destination. Fly-by trans- 
fers only require a single bus request. For a fly-by trans- 
fer, memory sees a load or a store on the bus while the 
requesting device is selected by the DMA acknowledge 
pin. The data is never actually read from or written to 
the 80960CA. For memory to device transfers, the 
processor issues a load, and, while reading the memory, 
accesses the external device with the DMA acknowl- 
edge pin. The data is then written directly to the desti- 
nation device with a single bus request. For a device to 
memory transfer, the reverse operation is performed. 
The DMA issues a store, and, while writing the memo- 
ry, accesses the source device with the DMA acknowl- 
edge pin. In this case, the processor floats the data bus 
and the device's data is written directly into memory. 



4.3.2.3 Data Chaining 

Each DMA channel can be programmed in a data
chaining mode. In this mode, all transfer information is
taken from a linked-list descriptor in memory (Figure
4-6). Data chaining is started by specifying a pointer to
a descriptor in memory. The transfer continues until
the number of bytes in the byte count field in the de-
scriptor is transferred. At this time, another linked-list
descriptor may be executed. The next descriptor is
specified by the next-pointer field in the current de-
scriptor. Data chaining continues until a null pointer
is encountered in the next-pointer field. Data chaining
can be designated as source chaining, destination chain-
ing, or both.

In data chaining mode, an option exists which allows 
chaining descriptors to be updated while the DMA is 
running. When this option is enabled, the DMA sets a 
bit in the DMA's special function register after loading 
a descriptor and then checks this bit before loading the 
next descriptor. If the bit has been cleared by the user, 
the DMA continues; otherwise, the DMA waits for the 
next descriptor to be set up and for the user to clear the 
bit. An interrupt can be generated when each buffer is 
complete or when the DMA is terminated with a null 
pointer or the EOP pin. 
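
The chaining behavior can be sketched in Python as a walk over a linked list. The `Descriptor` class and its field names are assumptions made for the example and do not reproduce the descriptor format of Figure 4-6; `None` stands in for the null pointer, and the wait-bit handshake is not modelled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Descriptor:
    """One linked-list descriptor (field names are illustrative)."""
    byte_count: int
    next: Optional["Descriptor"] = None  # None stands in for the null pointer


def run_chain(first):
    """Walk the chain and return the total number of bytes transferred."""
    total = 0
    current = first
    while current is not None:
        total += current.byte_count   # transfer this descriptor's buffer
        current = current.next        # follow the next-pointer field
    return total
```

A chain of three descriptors with byte counts 64, 32, and 16 transfers 112 bytes before the null pointer ends the operation.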

4.3.3 TRANSFER CHARACTERISTICS 

The DMA controller provides the programmer with a 
number of options for configuring the characteristics of 
a DMA transfer. Intelligent selection of transfer char- 
acteristics works to balance DMA performance and 
functionality with performance of the user program 
when the DMA is in progress. 

The DMA controller provides features to optimize
transfers by moving a maximum amount of data for 
each bus request issued. This is controlled by specifying 
the width of the source and destination directed bus 
requests for a DMA transfer, and by on-chip assembly 
or disassembly of the transfer when source and destina- 
tion are not of equal widths. 

Data alignment is performed automatically by the 
DMA controller when the source and destination of a 
transfer are not aligned. The alignment algorithm is 
optimized for many transfers, providing a performance 
comparable to the aligned transfer cases. 

4.3.3.1 Transfer Data Length 

The transfer data length specifies the length of bus re- 
quests directed to the source and destination in a stan- 
dard DMA transfer. Byte, short, word, or quad-word 
loads and stores are selected for either source or desti- 
nation when a DMA channel is set up. Assembly and 
disassembly of data is automatically performed when 
the source and destination widths are different. This 
feature provides the most efficient use of the bus when 
DMA transfers occur between a source and a destina- 
tion with different external bus widths. 



The DMA controller provides the option of using quad 
word transfers to enhance DMA performance. When 
quad transfers are specified, the DMA will request a 
four-word load request and four-word store request for 
each DMA transfer. The trade-off for the added DMA 
performance is latency on the external bus, preventing 
requests by the core, or by another DMA channel from 
being immediately serviced. 

4.3.3.2 Data Alignment 

The DMA controller supports transfer of source and 
destination data aligned to different byte boundaries in 
memory. The DMA implements microcode algorithms 
to transfer some non-aligned data with a performance 
level approaching that for aligned transfers. The DMA 
accomplishes this by attempting to issue the maximum
number of aligned bus requests during a DMA (Figure 
4-7). As shown, most of the overhead due to non- 
aligned DMAs is incurred at the beginning and end of 
the DMA. DMAs with low byte counts, therefore, do 
not benefit as much from the data alignment features of 
the DMA. The alignment feature is optimized for 8-bit 
to 8-bit, 32-bit to 32-bit and for 8-bit and 32-bit combi- 
nations of source and destination lengths. 

4.3.3.3 Channel Priority

The DMA controller arbitrates the priority of the 4 
DMA channels. If multiple DMA channels are en- 
abled, the DMA controller will determine in which or- 
der each channel is serviced. 

The DMA controller can be configured in one of two 
priority modes, fixed mode or rotating mode. The fixed
mode assumes a fixed priority for each channel with
channel 0 having the highest priority, followed by chan-
nels 1, 2, and 3, with channel 3 having the lowest prior-
ity. The rotating mode updates a channel's priority to
the lowest priority after that channel's DMA transfer is made.
This ensures that a single channel is never locked out by
other active channels. The priority sequence is always 
in the same order, with priority rotating from the low 
channel numbers to the high channel numbers. 
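
The two priority modes can be compared with a short Python model. The class `DmaArbiter` and its method `grant` are assumptions made for the example; the model only captures the ordering rules described above, not the hardware's timing.

```python
class DmaArbiter:
    """Toy model of fixed and rotating DMA channel priority."""

    def __init__(self, rotating=False, channels=4):
        self.rotating = rotating
        self.order = list(range(channels))  # highest priority first

    def grant(self, requesting):
        """Return the channel to service next, or None if none is requesting."""
        for ch in self.order:
            if ch in requesting:
                if self.rotating:
                    # The serviced channel drops to lowest priority while the
                    # ring order 0 -> 1 -> 2 -> 3 -> 0 is preserved.
                    i = self.order.index(ch)
                    self.order = self.order[i + 1:] + self.order[:i + 1]
                return ch
        return None
```

With channels 1 and 3 both requesting continuously, fixed mode services channel 1 every time, while rotating mode alternates between channels 1 and 3.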



4.3.3.4 Performance and Latency 
Considerations 

DMA operations and the user program share the re- 
sources of the core and of the external bus. DMA per- 
formance and the performance of the user program are 
coupled directly to the balance of load sharing between 
these two processes. The core resources necessary to 
perform a DMA transfer vary depending on the way a 
channel has been configured. For example, byte assem- 
bly and disassembly requires more processor overhead 
per byte of transfer than does a transfer in which the 
source and destination transfer lengths are equal. The 
performance of a DMA is also tightly coupled to the 
user program's use of the external bus. If the user pro-
gram does not make frequent bus requests, the requests
by the DMA controller will be serviced with little or no 
delay. 

The user can enhance performance of the DMA with 
trade-offs in system complexity and flexibility. Aligned
transfers eliminate the microcode overhead needed to 
perform the internal alignments. DMAs between re- 
gions of equal transfer widths eliminate overhead for 



assembly and disassembly. Source or destination mem- 
ory configured as burst memory will provide the most 
efficient use of the DMA controller when the quad- 
transfer feature is enabled. Using the fly-by mode re- 
duces the number of bus requests needed for a DMA 
since fly-by mode uses only a single load or a single 
store request for each transfer. 

4.3.4 DMA CONTROL AND CONFIGURATION 

The DMA Controller uses an SFR register, the DMA 
command (DMAC) register, and the setup DMA 
(sdma) instruction for configuration and control of a 
DMA. The sdma instruction is used to configure each 
DMA channel. Transfer widths, byte count, source and 
destination addresses for a DMA are specified in this 
instruction. 

The DMAC register (Figure 4-8) is described below. 

The channel enable field enables a DMA once the 
channel is set up. Clearing these bits will also cause a 
DMA transfer to be suspended. 

The terminal count field signals that byte count has 
reached zero and a DMA has ended. 

The channel active field indicates whether a channel is idle
or active. If set, this bit indicates that the channel is
active. This implies that the channel is servicing a
transfer or has a request pending. The active bits are
status information only.

The channel done field indicates that a DMA operation 
is complete. The done bits are status information only. 

The channel wait field is used for handshaking with a 
user program in data chaining mode. The DMA sets 
these bits when a new linked-list descriptor is read. The 
DMA will not read the next descriptor until this bit is 
cleared by the user. The user can set up the next de- 
scriptor and then clear the channel wait bits to dynami- 
cally change descriptors. 
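
The channel wait handshake described above can be sketched in software as a small state model. This is only an illustration of the protocol as the text states it: the structure and function names (chain_model, dma_read_next_descriptor, user_clear_wait) are hypothetical and do not correspond to 80960CA register definitions or instructions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical software model of one chained DMA channel.
   None of these names are 80960CA register or field names. */
typedef struct {
    bool   wait;        /* models the channel wait bit             */
    size_t next_index;  /* index of the next descriptor to be read */
    size_t count;       /* number of descriptors in the chain      */
} chain_model;

/* DMA side: attempt to read the next linked-list descriptor.
   Returns true if a descriptor was read (the wait bit is then set),
   false if the channel is blocked by the wait bit or the chain is done. */
bool dma_read_next_descriptor(chain_model *ch)
{
    assert(ch != NULL);
    if (ch->wait || ch->next_index >= ch->count)
        return false;
    ch->next_index++;
    ch->wait = true;    /* set when a new descriptor is read */
    return true;
}

/* User side: after setting up the following descriptor, clear the
   wait bit so the DMA may read it. */
void user_clear_wait(chain_model *ch)
{
    assert(ch != NULL);
    ch->wait = false;
}
```

In this model the user program has the whole interval between a descriptor read and its own call to user_clear_wait in which to rewrite the next descriptor, which is the dynamic descriptor change the text refers to.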
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A priority mode bit selects rotating or fixed priority 
mode. 

The throttle bit selects the maximum amount of core 
resources that the DMA microcode will receive in rela- 
tion to the execution of the user program. 

4.3.5 DMA INTERRUPTS 

The DMA controller is the source of 4 hardware inter- 
rupts in the 80960CA. The DMA Controller can be 
programmed to request an interrupt when a DMA is 
complete, or when a buffer transfer is completed in 
chaining mode. Each channel requests a different inter- 
rupt. 



4.4 Interrupt Controller

The 80960CA Interrupt Controller manages interrupts
which are requested by external agents or by the DMA
Controller. The interrupt controller manages 4 internal
DMA interrupt sources, a single NMI (Non-Maskable
Interrupt) pin, and 8 external interrupt pins. Up to 248
external interrupt sources can be supported by the
interrupt controller. The interrupt controller handles the
prioritization of software interrupts, hardware interrupts,
and the process priority, and signals the core
when interrupts are to be serviced. The interrupt controller
provides the low-latency interrupt service featured
on the 80960CA.

4.4.1 EXTERNAL INTERRUPTS

The 80960CA provides 8 interrupt pins and one NMI
pin for detecting external requests. The interrupt controller
allows the 8 interrupt pins to be configured as
dedicated inputs capable of requesting 8 interrupts, or
as a vectored input capable of requesting up to 248
interrupts. The NMI pin is always a dedicated input.
The interrupt controller pins are described below.

XINT7-XINT0   External Interrupts (inputs): These pins
              can be used as dedicated inputs or, acting
              together as an 8-bit number, can request any
              interrupt. The inputs are edge or level detected,
              and are optionally debounced internally.

NMI           Non-Maskable Interrupt (input): NMI requests
              the highest priority interrupt. NMI
              is always taken and is not maskable (as the
              name implies), and not interruptible.



4.4.2 INTERRUPT MODES 

The 8 external interrupt pins can be configured in one 
of three modes: dedicated mode, expanded mode, or 
mixed mode (Figure 4-9). 



4.4.2.1 Dedicated Mode Interrupts 

In dedicated mode, each of the 8 interrupt pins acts as a 
dedicated input. When an external event is detected on 
an interrupt pin, a unique interrupt is requested for that 
pin. It is possible to map each dedicated pin to one of a 
number of possible interrupt vectors. This is accom- 
plished by programming the interrupt map (IMAP) 
control registers with an interrupt vector number for 
each pin. (Recall that interrupt vector numbers are 
8-bit values which reference the 248 vectors in the in- 
terrupt table.) 
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Only the upper four bits of the vector number can be 
programmed for a dedicated mode interrupt. The lower
four bits are fixed at the value 0010 (binary). With four
programmable bits, one of 15 interrupt vectors is available
for each dedicated pin. These interrupt vectors span the
even priority levels from priority 2 to 30. The vector at
priority 0 is not defined.
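
The arithmetic behind the 15 dedicated-mode vectors can be sketched as follows. The function names are illustrative only, and the relation priority = vector number / 8 is an assumption of this sketch; it is consistent with the even priorities 2 to 30 quoted above but is not stated in this section.

```c
#include <assert.h>
#include <stdint.h>

/* Vector number for a dedicated-mode source: the 4-bit programmable
   field supplies bits 7:4, and bits 3:0 are fixed at 0010 (binary). */
uint8_t dedicated_vector(uint8_t imap_field)
{
    assert(imap_field <= 0xF);
    return (uint8_t)((imap_field << 4) | 0x2);
}

/* Assumption of this sketch: priority is the vector number divided by 8. */
uint8_t vector_priority(uint8_t vector)
{
    return (uint8_t)(vector >> 3);
}
```

With this mapping, field values 1 through 15 give vectors 18, 34, ... 242, whose priorities are the even values 2 through 30. A field value of 0 would give vector 2 at priority 0, which is the undefined case.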

The 15 interrupt vectors available to dedicated sources 
can be cached in internal data RAM. If this interrupt 
vector caching feature is selected, the processor will au- 
tomatically fetch the vector from data RAM, eliminat- 
ing the latency caused by a bus request for a vector in 
external memory. 

The DMA Controller can request four interrupts to signal
the end of a DMA for each of four channels. The
four interrupt signals from the DMA are handled by
the interrupt controller in the same way as an interrupt
pin configured as a dedicated input. Each of the four
DMA sources may request one of 15 interrupts by
programming the IMAP for that source.

4.4.2.2 Expanded Mode Interrupts 

In expanded mode, external hardware considers the
interrupt pins (XINT0-XINT7) as an 8-bit binary number.
This number is used directly as the interrupt vector
number. Each of the 248 possible interrupt vectors can
be referenced in this way, allowing a separate external
source for each vector. External hardware is responsible
for recognizing individual hardware sources and
then driving the interrupt vector number corresponding
to that source onto the interrupt pins.

4.4.2.3 Mixed Mode Interrupts 

In mixed mode, the 8 interrupt pins are divided into
two functional sets. One set functions in dedicated
mode, the other in expanded mode. Three pins
(XINT7-XINT5) are dedicated interrupt pins. A
programmable vector number is associated
with each of these pins. The remaining five interrupt
pins (XINT4-XINT0) are treated as the most significant
five bits of the expanded mode vector number. The
lower order bits are internally forced to 010 (binary) to form
the full 8-bit value for the vector number.
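
The vector formation for expanded and mixed modes can be sketched as below. The function names are hypothetical, and the pin values are treated as logical numbers; the electrical polarity of the pins is not covered by this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Expanded mode: the 8-bit value on the interrupt pins is used
   directly as the interrupt vector number. */
uint8_t expanded_vector(uint8_t pins_7_0)
{
    return pins_7_0;
}

/* Mixed mode: the five expanded-mode pins supply bits 7:3 of the
   vector number, and bits 2:0 are forced to 010 (binary). */
uint8_t mixed_expanded_vector(uint8_t pins_4_0)
{
    assert(pins_4_0 <= 0x1F);
    return (uint8_t)((pins_4_0 << 3) | 0x2);
}
```

Because the low three bits are fixed, each value on the five pins selects a vector in a different group of eight, so the five pins distinguish up to 32 vector numbers.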



4.4.3 INTERRUPT CONTROLLER SETUP 

The interrupt controller uses two special function registers
to manage interrupt requests by hardware sources.
The hardware interrupt pending register (IPND) and
the hardware interrupt mask register (IMSK) are
addressed as sf0 and sf1 respectively. A single bit in each
register corresponds to each of the 8 possible external
sources and 4 DMA sources for hardware interrupts.
The IMSK register performs the function of masking
hardware interrupts and the IPND register implements
posting of interrupts requested by hardware. When
configured for expanded or mixed mode interrupts, bit 0
of the IMSK register globally masks the expanded
mode interrupts.
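
The interaction of posting and masking can be modeled with two bit masks. This is a sketch under stated assumptions: the bit layout (bits 0-7 for the external sources, bits 8-11 for the DMA sources) and the convention that a set IMSK bit lets the source through are assumed here for illustration, and the names irq_model, post_interrupt and serviceable are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit layout, for illustration only. */
#define SRC_XINT(n) (1u << (n))        /* external sources 0..7 */
#define SRC_DMA(n)  (1u << (8 + (n)))  /* DMA sources 0..3      */

/* Software model of the pending and mask registers. */
typedef struct {
    uint32_t ipnd;  /* posted (pending) hardware interrupts        */
    uint32_t imsk;  /* assumed: 1 = source may request an interrupt */
} irq_model;

/* Hardware posts a request by setting the source's pending bit. */
void post_interrupt(irq_model *m, uint32_t src)
{
    assert(m != 0);
    m->ipnd |= src;
}

/* Pending requests that are not masked. */
uint32_t serviceable(const irq_model *m)
{
    assert(m != 0);
    return m->ipnd & m->imsk;
}
```

A masked source remains posted in the pending mask, so in this model it becomes serviceable as soon as its mask bit is set.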



4.4.4 NON-MASKABLE INTERRUPT

In addition to the maskable hardware interrupts, a sin- 
gle Non-Maskable Interrupt (NMI) is provided. A dedi- 
cated NMI pin is used to request this interrupt. NMI is 
defined as a higher priority than any hardware inter- 
rupt, software interrupt, or process priority. The NMI 
procedure, therefore, can never be interrupted and 
must execute the return instruction before other proce- 
dures can execute. The NMI procedure is entered 
through vector 248. This vector is cached in internal 
data RAM at initialization to reduce latency for the 
NMI. 
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The 80960CA Core is a high-performance implementa- 
tion of the 80960 Core Architecture. This section brief- 
ly describes the microarchitecture of the 80960CA core 
and the key constructs used to achieve parallel instruc- 
tion execution. 

The 80960CA core can be divided into the 6 main sub- 
units listed below. 

— Instruction Sequencer 

— Register File 

— Execution Unit 

— Multiply and Divide Unit 

— Address Generation Unit 

— Static Data RAM and Local Register Cache 

Figure A-1 is a simple block diagram of the 80960CA.
The nucleus of the processor is the Instruction Sequencer
and Register File. The other subunits of the
core, referred to as coprocessors, radiate from these
units, connecting to either the register (REG) side or
the memory (MEM) side of the processor. The Instruction
Sequencer issues directives, via the REG and
MEM interfaces, which target a specific coprocessor.
That coprocessor then executes an express function
virtually decoupled from the IS and the other coprocessors.
The REG and MEM data busses shown in Figure
A-1 are used to transfer data between the common
Register File and the coprocessors.



A.1 Instruction Sequencer 

The Instruction Sequencer (IS) decodes the instruction
stream and drives the decoded instruction stream onto
the coprocessor interfaces. In a single clock, the IS
decodes up to four instructions and issues up to three of these
instructions to the on-chip coprocessors or to the IS
itself. One register (REG) format, one memory (MEM)
format, and one control or control and branch (CTRL
or COBR) format instruction can be issued at one time.
These instructions are directed respectively to the REG
coprocessors, the MEM coprocessors, or to the IS. The
ability to issue multiple instructions in parallel can
result in the simultaneous execution of many instructions.
An optimizing compiler or hand optimization
of assembly code can easily produce an instruction
stream which takes full advantage of the parallel
execution of the core.
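
The issue limit of one REG, one MEM and one CTRL/COBR format instruction per clock can be illustrated with a simplified model. This sketch assumes strictly in-order issue that stops at the first instruction whose issue slot is already taken, and it ignores register and resource dependencies; the actual selection rules of the Instruction Sequencer are not given here, and the names fmt_t and issue_count are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Instruction format classes, one issue slot each per clock. */
typedef enum { FMT_REG = 0, FMT_MEM = 1, FMT_CTRL = 2 } fmt_t;

/* Count how many instructions from a decode window of up to four
   could issue in one clock under the simplifying assumptions above. */
int issue_count(const fmt_t *window, int n)
{
    bool used[3] = { false, false, false };
    int issued = 0;

    assert(n >= 0 && n <= 4);
    for (int i = 0; i < n; i++) {
        assert(window[i] >= FMT_REG && window[i] <= FMT_CTRL);
        if (used[window[i]])
            break;              /* slot for this format already taken */
        used[window[i]] = true;
        issued++;
    }
    return issued;
}
```

Under this model a window ordered REG, MEM, CTRL issues three instructions in one clock, while two adjacent REG format instructions issue one at a time, which is why instruction ordering by a compiler or by hand affects the achieved parallelism.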

A technique known as resource scoreboarding is used to 
manage the parallel execution of instructions and the 
common resources of the processor. A coprocessor, for 
example, can scoreboard itself, indicating that it cannot 
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act on another instruction until an instruction currently 
executing on that coprocessor is completed. A specific 
form of resource scoreboarding is referred to as register 
scoreboarding. When the computation stage of an in- 
struction takes more than one clock, the destination 
register or registers for the result are scoreboarded as 
busy. A subsequent operation needing that particular 
register will be delayed until the multi-clock operation 
is completed. Instructions which do not use the score- 
boarded registers can be executed in parallel. 
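
Register scoreboarding can be sketched as one busy bit per register. The model below assumes the 32 registers (16 local and 16 global) are numbered 0 to 31; the type and function names are hypothetical and the sketch does not describe the actual hardware implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One busy bit per register; numbering 0..31 is an assumption. */
typedef struct {
    uint32_t busy;
} scoreboard;

/* Mark the destination of a multi-clock operation as busy. */
void mark_busy(scoreboard *sb, unsigned dst)
{
    assert(sb != 0 && dst < 32);
    sb->busy |= (uint32_t)1 << dst;
}

/* Clear the busy bit when the result has been written. */
void mark_done(scoreboard *sb, unsigned dst)
{
    assert(sb != 0 && dst < 32);
    sb->busy &= ~((uint32_t)1 << dst);
}

/* An instruction may proceed only if none of the registers it uses
   (sources or destination, given as a bit mask) is scoreboarded. */
bool can_issue(const scoreboard *sb, uint32_t regs_used)
{
    assert(sb != 0);
    return (sb->busy & regs_used) == 0;
}
```

An instruction that touches only registers with clear busy bits passes can_issue while the multi-clock operation is still in flight, which is the parallel execution the paragraph describes.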

The IS manages a three stage parallel instruction pipe- 
line (Figure A-2). In the first stage of the pipeline (pipe 
0), the address of the next instruction is calculated. 
This address may be the next sequential instruction, the 
target of a branch, or a location in microcode. In the 
second stage of the pipeline (pipe 1), the instructions 
are issued to the rest of the machine. In the third stage 
(pipe 2), the instruction computation is started, and for 
single cycle instructions, a result is returned. 

Several microarchitectural features of the core are de- 
signed to minimize performance loss due to pipeline 
breaks. 



Branch Prediction: To minimize pipeline breaks due to
branching, the user can specify the direction that a
conditional branch instruction will usually follow. The
processor will execute along the specified instruction
path with no pipeline break. If the branch direction
specified was the direction actually selected by execution
of the conditional branch, no pipeline break occurs.
The direction of the branch guess is determined
by a bit value in the CTRL format instructions.

Register Bypassing: Register bypassing is a feature
which forwards the result of an instruction for immediate
use as the source of another instruction. This
forwarding occurs at the same time that the value is written
to its destination register. Bypassing the register file
saves the one clock cycle break which would otherwise
occur while waiting for the value to be written to the
register file and the register scoreboard to be cleared.

On-chip Cache: The on-chip instruction cache and local
register cache eliminate many pipeline breaks which
would occur if the IS were forced to wait for code or data to
be moved between the 80960CA and external memory.

Register File Access: The Register File allows multiple
instructions to gain access to the register set
simultaneously. This eliminates pipeline breaks which would
be caused by a loss of access to the register set by any
coprocessor.

A.1.1 INSTRUCTION CACHE

The IS includes a 1 Kbyte two-way set associative
instruction cache capable of delivering up to four instructions
each clock to the Instruction Sequencer. The
cache allows inner loops of code to execute with no
external instruction fetches.



A.1.2 MICROCODE ROM 

The 80960CA uses microcode ROM to implement com- 
plex instructions and functions. This includes calls, re- 
turns, DMA transfers, and initialization sequences. Mi- 
crocode provides an inexpensive and simple method for 
implementing complex instructions in the mostly RISC 
environment of the 80960CA. When the IS encounters
a microcoded instruction, it automatically branches to 
the microcode routine. The 80960CA performs this mi- 
crocode branch in clocks. 
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A.2 Register File 

The Register File (RF) contains the 16 local and 16 
global registers. The register file has six ports (Figure 
A-3), allowing parallel access of the register set by sev- 
eral 80960CA coprocessors. This parallel access results 
in an ability to execute one simple logic or arithmetic 
instruction, one memory operation (load/store), and 
one address calculation per clock. 

MEM coprocessors interface to the RF with a 128-bit 
wide load bus and a 128-bit wide store bus. These bus- 
ses enable movement of up to 4 words per clock to and 
from the RF. These busses also allow LOAD data from 
a previous read access and STORE data from a current 
write access to be processed in the register file simulta- 
neously. An additional 32-bit port allows an address or 
address reduction operand to be simultaneously fetched 
by the Address Generation Unit. 

REG coprocessors interface to the RF with two 64-bit 
source busses and a single 64-bit destination bus. With 
this bus structure, two source operands are simulta- 
neously issued to a REG coprocessor when an instruc- 
tion is issued. A 64-bit destination bus allows the result 
from the previous operation to be written to the RF at 
the same time that the current operation's source oper- 
ands are issued. 



A.3 Execution Unit 

The Execution Unit is the 32-bit Arithmetic and Logic 
Unit of the 80960CA Core. The EU can be viewed as a 
self-contained REG coprocessor with its own instruc- 
tion set. As such, the EU is responsible for executing or 
supporting the execution of all the integer and ordinal 
arithmetic instructions, the logic and shift instructions, 
the move instructions, the bit and bit field instructions, 
and the compare operations. The EU performs any 
arithmetic or logical instructions in a single clock. 



A.4 Multiply and Divide Unit

The Multiply and Divide Unit (MDU) is a REG coproc- 
essor which performs integer and ordinal multiply, di- 
vide, remainder, and modulo operations. The MDU de- 
tects integer overflow and divide by zero errors. The 
MDU is optimized for multiplication, performing 32- 
bit multiplies in 4 clocks. The MDU performs multi- 
plies and divides in parallel with the main execution 
unit. 



A.5 Address Generation Unit 

The Address Generation Unit (AGU) is a MEM coprocessor
which computes the effective addresses for memory
operations. It directly executes the load address
instruction (lda) and calculates addresses for loads and
stores based on the addressing mode specified in these
instructions. The address calculations are performed in
parallel with the main execution unit (EU).



A.6 Data RAM and Local Register Cache

The Data RAM and Local Register Cache is part of a 
1.5 Kbyte block of on-chip Static RAM (SRAM). 
1 Kbyte of this SRAM is mapped into the 80960CA's 
address space from location 00000000H to 
000003FFH. A portion of the remaining 512 bytes is 
dedicated to the Local Register Cache. This part of 
internal SRAM is not directly visible to the user. Loads
and stores, including quad-word accesses, to the internal
SRAM are typically performed in only one clock.
The complete local register set, therefore, can be moved 
to the local register cache in only four clocks. 
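
The four-clock figure follows from the register set size and the quad-word access width. A minimal sketch of that arithmetic, assuming one quad-word (four 32-bit words, 16 bytes) access per clock and using a hypothetical function name:

```c
#include <assert.h>

/* Number of quad-word accesses needed to move a register set,
   assuming a quad-word is 16 bytes and one access per clock. */
unsigned quad_word_accesses(unsigned reg_count, unsigned reg_bytes)
{
    const unsigned quad_bytes = 16;
    unsigned total;

    assert(reg_count > 0 && reg_bytes > 0);
    total = reg_count * reg_bytes;
    return (total + quad_bytes - 1) / quad_bytes;  /* round up */
}
```

For the 16 local registers of 4 bytes each, the total is 64 bytes, or four quad-word accesses.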
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80960CA-33,-25,-16 
32-BIT HIGH PERFORMANCE EMBEDDED PROCESSOR 

• Two Instructions/Clock Sustained Execution 

• Four 59 Mbytes/s DMA Channels with Data Chaining 

• Demultiplexed 32-Bit Burst Bus with Pipelining 

32-bit Parallel Architecture 

— Two Instructions/clock Execution 

— Load/Store Architecture 

— 16, 32-bit Global Registers 
— 16, 32-bit Local Registers 

— Manipulate 64-Bit Bit Fields 

— 11 Addressing Modes

— Full Parallel Fault Model 

— Supervisor Protection Model 

Fast Procedure Call/Return Model 

— Full Procedure Call in 4 clocks 

— RISC Call in 2 clocks (BAL) 

On-Chip Register Cache 

— Caches Registers on Call/Ret 

— Minimum of 6 Frames provided 

— Number of Frames Programmable, 
up to 15 

On-Chip Instruction Cache 

— 1 Kbyte Two-Way Set Associative 

— 128-bit Path to Instruction Sequencer 

— Cache-Lock Modes 

— Cache-Off Mode 



High Bandwidth On-Chip Data Ram 

— 1 Kbyte On-chip RAM for Data

— Sustain 128-bits per clock access 

Four On-Chip DMA Channels 

— 59 Mbytes/s Fly-by Transfers 

— 32 Mbytes/s Two-Cycle Transfers 

— Data Chaining 

— Data Packing/Unpacking 

— Programmable Priority Method 

32-Bit Demultiplexed Burst Bus 

— 128-Bit Internal Data Paths to and 
from Registers 

— Burst Bus for DRAM Interfacing 

— Address Pipelining Option 

— Fully Programmable Wait States 

— Supports 8, 16 or 32-bit Bus Widths 

— Supports Unaligned Accesses 
— Supervisor Protection Pin 

High-Speed Interrupt Controller 

— Up to 248 External Interrupts 

— 32 Fully Programmable Priorities 

— Multi-mode 8-bit Interrupt Port 

— Four Internal DMA Interrupts 

— Separate, Non-maskable Interrupt Pin 

— Context Switch in 750 ns Typical 




Figure 1. 80960CA Die Photo 
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1.0 PURPOSE 

This document provides a preview of the electrical 
characteristics expected of the 33, 25 and 16 MHz 
versions of the 80960CA. For a detailed description 
of any 80960CA functional topic, other than para- 
metric performance, consult the latest 80960CA 
Product Overview (Order No. 270669), or the 
80960CA User's Manual (Order No. 270710). 



2.0 80960CA OVERVIEW 

The 80960CA is the second-generation member of 
the 80960 Family of embedded processors. The 
80960CA is object code compatible with the 32-bit 
80960 Core Architecture while including Special 
Function Register extensions to control on-chip pe- 
ripherals, and instruction set extensions to shift 64- 
bit operands and configure on-chip hardware. Multi- 
ple 128-bit internal busses, on-chip instruction cach- 
ing and a sophisticated instruction scheduler allow 
the processor to sustain execution of two instruc- 
tions every clock, and peak at execution of 3 instruc- 
tions per clock. 



A 32-bit demultiplexed and pipelined burst bus pro- 
vides a 132 Mbyte/s bandwidth to a system's high- 
speed external memory sub-system. In addition, the 
80960CA's on-chip caching of instructions, proce- 
dure context and critical program data substantially 
decouples system performance from the wait states 
associated with accesses to the system's slower, 
cost sensitive, main memory sub-system. 

The 80960CA bus controller also integrates full wait 
state and bus width control for highest system per- 
formance with minimal system design complexity. 
Unaligned access and Big Endian byte order support 
reduces the cost of porting existing applications to 
the 80960CA. 

The processor also integrates four complete data- 
chaining DMA channels and a high-speed interrupt 
controller on-chip. The DMA channels perform: sin- 
gle-cycle or two-cycle transfers, data packing and 
unpacking, and data chaining. Block transfers, in ad- 
dition to source or destination synchronized trans- 
fers are provided. 

The interrupt controller provides full programmability 
of 248 interrupt sources into 32 priority levels with a 
typical interrupt task switch ("latency") time of 
750 ns. 




[Block diagram. Labeled blocks: Interrupt Port; Programmable
Interrupt Controller; Multiply/Divide Unit; Execution Unit;
Instruction Prefetch Queue; Instruction Cache (1 Kbyte, two-way
set associative); Parallel Instruction Scheduler; Register-side
Machine Bus; Memory-side Machine Bus; 64-bit SRC1, SRC2 and DST
busses; Six-Port Register File; Four-Channel DMA Controller;
Memory Region Configuration; Bus Controller; 1 Kbyte Data RAM;
5-15-Set Register Cache.]




Figure 2. 80960CA Block Diagram 
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2.1. The C-Series Core 

The C-Series core is a very high performance micro- 
architectural implementation of the 80960 Core Ar- 
chitecture. The C-Series core can sustain execution 
of two instructions per clock (66 MIPS at 33 MHz).
To achieve this level of performance, Intel has incor- 
porated state-of-the-art silicon technology and inno- 
vative microarchitectural constructs into the imple- 
mentation of the C-Series core. Factors that contrib- 
ute to the core's performance include: 

— Parallel instruction decoding allows issue of up 
to three instructions per clock. 

— Most instructions execute in a single clock. 

— Parallel instruction decode allows sustained, 
simultaneous execution of two single-clock in- 
structions every clock cycle. 

— Efficient instruction pipeline is designed to mini- 
mize pipeline break losses. 

— Register and resource scoreboarding allow 
simultaneous multi-clock instruction execution. 

— Branch look-ahead and prediction allows many 
branches to execute with no pipeline break. 

— Local Register Cache integrated on-chip caches 
Call/Return context. 

— Two-way set associative, 1 Kbyte integrated in- 
struction cache 

— 1 Kbyte integrated Data RAM sustains a four-word
(128-bit) access every clock cycle.



2.2. Pipelined, Burst Bus 

A 32-bit high performance bus controller interfaces 
the 80960CA to external memory and peripherals. 
The Bus Control Unit features a maximum transfer 
rate of 132 Mbytes per second (at 33 MHz). Internal- 
ly programmable wait states and 16 separately con- 
figurable memory regions allow the processor to in- 
terface with a variety of memory subsystems with a 
minimum of system complexity, and a maximum of 
performance. The Bus Controller's main features in- 
clude: 



— Demultiplexed, Burst Bus to exploit most efficient 
DRAM access modes 

— Address Pipelining to reduce memory cost while 
maintaining performance 

— 32-, 16- and 8-bit modes for I/O interfacing ease. 

— Full internal wait state generation to reduce sys- 
tem cost 

— Little and Big Endian support to ease application 
development 

— Unaligned access support for code portability 

— Three-deep request queue to decouple the bus 
from the core 

— Direct interface to Intel's 27C960 Burst EPROM 
and 82596 Ethernet Controller. 
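
The 132 Mbytes per second figure is consistent with one 32-bit transfer on every clock at 33 MHz. The sketch below shows that arithmetic; it assumes one data transfer per clock with no wait states, and the function name is hypothetical.

```c
#include <assert.h>

/* Peak transfer rate in Mbytes/s, assuming one data transfer of
   the full bus width on every clock and no wait states. */
unsigned peak_mbytes_per_s(unsigned bus_bits, unsigned clock_mhz)
{
    assert(bus_bits == 8 || bus_bits == 16 || bus_bits == 32);
    return (bus_bits / 8) * clock_mhz;
}
```

The same calculation gives the peak for the narrower bus widths and for the slower speed grades, for example 100 Mbytes/s for a 32-bit bus at 25 MHz.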



2.3. Flexible DMA Controller 

A four channel DMA controller provides high speed 
DMA control for data transfers involving peripherals 
and memory. The DMA provides advanced features 
such as data chaining, byte assembly and disassem- 
bly, and a high performance fly-by mode capable of 
transfer speed of up to 59 Mbytes per second at 
33 MHz. The DMA controller features a performance 
and flexibility which is only possible by integrating 
the DMA controller and the 80960CA core. 



2.4. Priority Interrupt Controller 

A programmable-priority interrupt controller man- 
ages up to 248 external sources through the 8-bit 
external interrupt port. The Interrupt Unit also han- 
dles the 4 internal sources from the DMA controller, 
and a single non-maskable interrupt input. The 8-bit 
interrupt port can also be configured to provide indi- 
vidual interrupt sources that are level, or edge trig- 
gered. 

Interrupts in the 80960CA are prioritized and sig- 
naled within 270 ns of the request. If the interrupt is 
of higher priority than the processor priority, the con- 
text switch to the interrupt routine typically is com- 
plete in another 480 ns. The interrupt unit provides 
the mechanism for the low latency and high through- 
put interrupt service which is essential for embedded 
applications. 
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2.5. Instruction Set Summary 

The following table summarizes the 80960CA instruction set by logical groupings. See the 80960CA User's 
Manual for a complete description of the instruction set. 



Data Movement
    Load, Store, Move, Load Address

Arithmetic
    Add, Subtract, Multiply, Divide, Remainder, Modulo, Shift,
    * Extended Shift, Extended Multiply, Extended Divide,
    Add with Carry, Subtract with Carry, Rotate

Logical
    And, Not And, And Not, Or, Exclusive Or, Not Or, Or Not, Nor,
    Exclusive Nor, Not, Nand

Bit, Bit Field and Byte
    Set Bit, Clear Bit, Not Bit, Alter Bit, Scan for Bit,
    Span over Bit, Extract, Modify, Scan Byte for Equal

Comparison
    Compare, Conditional Compare, Compare and Increment,
    Compare and Decrement, Condition Test, Check Bit

Branch
    Unconditional Branch, Conditional Branch, Compare and Branch

Call and Return
    Call, Call Extended, Call System, Return, Branch and Link

Fault
    Conditional Fault, Synchronize Faults

Debug
    Modify Trace Controls, Mark, Force Mark

Processor Management
    Modify Process Controls, Modify Arithmetic Controls,
    * System Control, * DMA Control, Flush Local Registers

Atomic
    Atomic Add, Atomic Modify


NOTE: 

Instructions marked by (*) are 80960CA extensions to the 80960 instruction set. 
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3.0 PACKAGE INFORMATION 



3.1. Package Introduction 

This section describes the pins, pinouts and thermal 
characteristics for the 80960CA in the 168-pin Ceramic
Pin Grid Array (PGA) package and the 196-pin
Plastic Quad Flat Package (PQFP). For complete
package specifications and information, see the Intel 
Packaging Specification (Order # 231369). 



3.2. Pin Descriptions 

The 80960CA pins are described in this section. Ta- 
ble 1 presents the legend for interpreting the pin de- 
scriptions in the following tables. 

The pins associated with the 32-bit demultiplexed 
processor bus are described in Table 2. The pins 
associated with basic processor configuration and 
control are described in Table 3. The pins associat- 
ed with the 80960CA DMA Controller and Interrupt 
Unit are described in Table 4. 

Figure 3 provides an example pin description table
entry. The "I/O" signifies that the data pins are
input-output. The "S" indicates the pins are synchronous
to PCLK2:1. The "H(Z)" indicates that these
pins float while the processor bus is in a Hold
Acknowledge state. The "R(Z)" notation indicates that
the pins also float while RESET is low.

All pins float while the processor is in the ONCE™ 
mode. 



Table 1. Pin Description Nomenclature

Symbol    Description

I         Input only pin

O         Output only pin

I/O       Pin can be either an input or output

-         Pins "must be" connected as described

S(...)    Synchronous. Inputs must meet setup and hold times
          relative to PCLK2:1 for proper operation of the
          processor. All outputs are synchronous to PCLK2:1.
              S(E)  Edge sensitive input
              S(L)  Level sensitive input

A(...)    Asynchronous. Inputs may be asynchronous to PCLK2:1.
              A(E)  Edge sensitive input
              A(L)  Level sensitive input

H(...)    While the processor's bus is in the Hold Acknowledge
          or Bus Backoff state, the pin:
              H(1)  is driven to Vcc
              H(0)  is driven to Vss
              H(Z)  floats
              H(Q)  continues to be a valid output

R(...)    While the processor's RESET pin is low, the pin:
              R(1)  is driven to Vcc
              R(0)  is driven to Vss
              R(Z)  floats
              R(Q)  continues to be a valid output



Name:         D31:0
Type:         I/O, S(L), H(Z), R(Z)
Description:  DATA BUS carries 32, 16 or 8-bit data quantities
              depending on bus width configuration. The least
              significant bit of the data is carried on D0 and the
              most significant on D31. When the bus is configured
              for 8 bit data, the lower 8 data lines, D7:0, are
              used. For 16 bit data widths, D15:0 are used. For
              32 bit data the full data bus is used.



Figure 3. Example Pin Description Entry 
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Table 2. 80960CA Pin Description— External Bus Signals 



Name:         A31:2
Type:         O, S, H(Z), R(Z)
Description:  ADDRESS BUS carries the upper 30 bits of the physical
              address. A31 is the most significant address bit and
              A2 is the least significant. During a bus access,
              A31:2 identify all external addresses to word (4-byte)
              boundaries. The byte enable signals indicate the
              selected byte in each word. During burst accesses, A3
              and A2 increment to indicate successive data cycles.

Name:         D31:0
Type:         I/O, S(L), H(Z), R(Z)
Description:  DATA BUS carries 32, 16 or 8-bit data quantities
              depending on bus width configuration. The least
              significant bit of the data is carried on D0 and the
              most significant on D31. When the bus is configured
              for 8 bit data, the lower 8 data lines, D7:0, are
              used. For 16 bit bus widths, D15:0 are used. For
              32 bit bus widths the full data bus is used.

Name:         BE3, BE2, BE1, BE0
Type:         O, S, H(Z), R(1)
Description:  BYTE ENABLES select which of the four bytes addressed
              by A31:2 are active during an access to a memory
              region configured for a 32-bit data-bus width. BE3
              applies to D31:24; BE2 applies to D23:16; BE1 applies
              to D15:8; and BE0 applies to D7:0.

              32-bit bus:  BE3 - Byte Enable 3 - enable D31:24
                           BE2 - Byte Enable 2 - enable D23:16
                           BE1 - Byte Enable 1 - enable D15:8
                           BE0 - Byte Enable 0 - enable D7:0

              For accesses to a memory region configured for a
              16-bit data-bus width, the processor directly encodes
              BE3, BE1 and BE0 to provide BHE, A1 and BLE
              respectively.

              16-bit bus:  BE3 - Byte High Enable (BHE) - enable D15:8
                           BE2 - Not used (is driven high or low)
                           BE1 - Address Bit 1 (A1)
                           BE0 - Byte Low Enable (BLE) - enable D7:0

              For accesses to a memory region configured for an
              8-bit data bus width, the processor directly encodes
              BE1 and BE0 to provide A1 and A0 respectively.

              8-bit bus:   BE3 - Not used (is driven high or low)
                           BE2 - Not used (is driven high or low)
                           BE1 - Address Bit 1 (A1)
                           BE0 - Address Bit 0 (A0)

Name:         W/R
Type:         O, S, H(Z), R(0)
Description:  WRITE/READ is low (0) for read requests and high (1)
              for write requests. The W/R signal changes in the same
              clock cycle as ADS. It remains valid for the entire
              access in non-pipelined regions. In pipelined regions,
              W/R may not be valid in the last cycle of a read
              access.

Name:         ADS
Type:         O, S, H(Z), R(1)
Description:  ADDRESS STROBE indicates valid address and the start
              of a new bus access. ADS is asserted for the first
              clock of a bus access.

Name:         READY
Type:         I, S(L), H(Z), R(Z)
Description:  READY is an input which signals the termination of a
              data transfer. READY is used to indicate that read
              data on the bus is valid, or that a write-data
              transfer has completed. The READY signal works in
              conjunction with the internally programmed wait-state
              generator. If READY is enabled in a region, the pin is
              sampled after the programmed number of wait-states has
              expired. If the READY pin is deasserted high, wait
              states will continue to be inserted until READY
              becomes asserted low. This is true for the NRAD, NRDD,
              NWAD and NWDD wait states. The NXDA wait states cannot
              be extended.
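
The way READY extends the programmed wait states can be sketched with a simple count. The model works on logical "asserted" samples rather than pin levels, one sample per clock after the programmed wait states have expired; the function and parameter names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Total wait states for one data transfer in a region where READY
   is enabled. ready_samples[i] is true if READY is sampled asserted
   on the i-th clock after the programmed wait states have expired.
   Each sample found deasserted inserts one more wait state. */
unsigned total_wait_states(unsigned programmed,
                           const bool *ready_samples,
                           unsigned n_samples)
{
    unsigned waits = programmed;

    assert(ready_samples != 0 || n_samples == 0);
    for (unsigned i = 0; i < n_samples; i++) {
        if (ready_samples[i])
            return waits;   /* transfer terminates on this clock */
        waits++;            /* READY deasserted: one more wait state */
    }
    return waits;           /* READY not seen asserted in the samples given */
}
```

A memory that always has READY asserted by the time it is first sampled therefore runs at the programmed wait-state count, and a slower device adds only as many wait states as it holds READY deasserted.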
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Table 2. 80960CA Pin Description— External Bus Signals (Continued) 



Name 


Type 


Description 


BTERM 


I 

S(L) 
H(Z) 
R(Z) 


BURST TERMINATE — The burst terminate signal breaks up a burst access and 
causes another address cycle to occur. The BTERM signal works in conjunction 
with the internally programmed wait-state generator. If READY and BTERM are 
enabled in a region, the BTERM pin is sampled after the programmed number of 
wait states has expired. When BTERM is asserted, additional wait states are 
inserted until BTERM is deasserted. When BTERM is deasserted, a new ADS 
signal is generated and the access is completed. The READY input is ignored 
when BTERM is asserted. BTERM must be externally synchronized to satisfy the 
BTERM setup and hold times. 


WAIT 




s 

H(Z) 
R(1) 


WAIT indicates the status of the internal wait state generator. WAIT is active 
when wait states are being caused by the internal wait state generator and not by 
the READY or BTERM inputs. WAIT can be used to derive a write-data strobe. 
WAIT can also be thought of as a READY output that the processor provides 
when it is inserting wait states. 


BLAST 




s 

H(Z) 
R(0) 


BURST LAST indicates the last transfer in a bus access. BLAST is asserted in the 
last data transfer of burst and non-burst accesses after the wait state counter 
reaches zero. BLAST remains active until the clock following the last cycle of the 
last data transfer of a bus access. If the READY or BTERM input is used to extend 
wait states, the BLAST signal remains active until READY or BTERM terminates 
the access. 


DT/R 




s 

H(Z) 

R(0) 


DATA TRANSMIT/RECEIVE indicates direction for data transceivers. DT/R is 
used in conjunction with DEN to provide control for data transceivers attached to 
the external bus. When DT/R is low (0), the signal indicates that the processor will 
receive data. Conversely, when high (1) the processor will send data. DT/R will 
change only while DEN is high. 


DEN 




s 

H(Z) 
R(1) 


DATA ENABLE indicates data cycles in a bus access. DEN is asserted (low) at 
the start of the first data cycle of a bus request and is deasserted (high) at the end 
of the last data cycle. DEN is used in conjunction with DT/R to provide control for 
data transceivers attached to the external bus. DEN remains asserted for 
sequential reads from pipelined memory regions. DEN is high when DT/R 
changes. 


LOCK 




s 

H(Z) 
R(1) 


BUS LOCK indicates that an atomic read-modify-write operation is in progress. 
LOCK may be used to prevent external agents from accessing memory which is 
currently involved in an atomic operation. LOCK is asserted (0) in the first clock of 
an atomic operation, and deasserted in the clock cycle following the last bus 
access for the atomic operation. To allow the most flexibility for memory system 
enforcement of locked accesses, the processor will acknowledge a bus hold 
request when LOCK is asserted. The processor will perform DMA transfers while 
LOCK is active. 


HOLD 


S(L) 
H(Z) 
R(Z) 


HOLD REQUEST signals that an external agent requests access to the external 
bus. The processor asserts HOLDA after completing the current bus request. 
HOLD, HOLDA, and BREQ are used together to arbitrate access to the 
processor's external bus by external bus agents. 


BOFF 


I 

S(L) 
H(Z) 
R(Z) 


BUS BACKOFF, when asserted (0), suspends the current access and causes the 
bus pins to float. When the pin is deasserted (1), the ADS signal is asserted on 
the next clock cycle and the access is resumed. 

Table 2. 80960CA Pin Description— External Bus Signals (Continued) 



Name 


Type 


Description 


HOLDA 




S 
H(1) 
R(Q) 


HOLD ACKNOWLEDGE indicates to a bus requestor that the processor has 
relinquished control of the external bus. When HOLDA is asserted, the external 
address bus, data bus, and bus control signals are floated. HOLD, BOFF, HOLDA 
and BREQ are used together to arbitrate access to the processor's external bus 
by external bus agents. Since the processor will grant HOLD requests and enter 
the Hold Acknowledge state even while RESET is active, the state of the HOLDA 
pin will be independent of the RESET pin. 


BREQ 




s 

H(Q) 
R(0) 


BUS REQUEST indicates that the processor wishes to perform a bus request. 
BREQ can be used by external bus arbitration logic in conjunction with HOLD and 
HOLDA to determine when to return mastership of the external bus to the 
processor. 


D/C 




s 

H(Z) 
R(Z) 


DATA OR CODE indicates whether a bus request is a data request (1) or an 
instruction request (0). D/C has the same timing as W/R. 


DMA 




s 

H(Z) 
R(Z) 


DMA ACCESS indicates whether the bus request was initiated by the DMA 
controller. DMA will be asserted (low) for any DMA request. DMA will be 
deasserted (high) for all other requests. 


SUP 




s 

H(Z) 
R(Z) 


SUPERVISOR ACCESS indicates whether the bus request is issued while in 
supervisor mode. SUP will be asserted (low) when the request has supervisor 
privileges, and will be deasserted (high) otherwise. SUP can be used to isolate 
supervisor code and data structures from non-supervisor requests. 
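
The way READY extends the programmed wait states, as described for the READY pin in Table 2, can be illustrated with a small Python model. This is only a sketch under stated assumptions: the function name and the list-of-samples representation are hypothetical, and BTERM, pipelining and setup/hold timing are ignored.

```python
from typing import Iterable


def total_wait_states(programmed: int, ready_levels: Iterable[int]) -> int:
    """Count wait states for one data transfer in a READY-enabled region.

    programmed   -- wait states from the internal generator (e.g. N_RAD)
    ready_levels -- READY pin level on each clock once the programmed
                    count has expired: 0 = asserted (low), 1 = deasserted
    """
    waits = programmed
    for level in ready_levels:
        if level == 0:   # READY asserted low: the transfer terminates
            return waits
        waits += 1       # READY still high: one more wait state is inserted
    raise ValueError("READY never asserted in the supplied samples")
```

For example, with two programmed wait states and READY held high for two further clocks, `total_wait_states(2, [1, 1, 0])` gives four wait states. The N_XDA wait states are not modeled because they cannot be extended.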



Table 3. 80960CA Pin Description— Processor Control Signals 



Name 


Type 


Description 


RESET 


I 

A(L) 
H(Z) 
R(Z) 
N(Z) 


RESET causes the chip to reset. When RESET is asserted (low), all external signals 
return to the reset state. When RESET is deasserted, initialization begins. When the 
two-x clock mode is selected, RESET must remain asserted for 16 PCLK2:1 cycles 
before being deasserted in order to guarantee correct initialization of the processor. 
When the one-x clock mode is selected, RESET must remain asserted for 10,000 
PCLK2:1 cycles before being deasserted in order to guarantee correct initialization of 
the processor. The CLKMODE pin selects one-x or two-x input clock division of the 
CLKIN pin. 

The processor's Hold Acknowledge bus state functions while the chip is reset. If the 
processor's bus is in the Hold Acknowledge state when RESET is activated, the 
processor will internally reset, but will maintain the Hold Acknowledge state on 
external pins until the Hold request is removed. If a hold request is made while the 
processor is in the reset state, the processor bus will grant HOLDA and enter the Hold 
Acknowledge state. 


FAIL 




s 

H(Q) 
R(0) 


FAIL indicates failure of the processor's self-test performed at initialization. When 
RESET is deasserted and the processor begins initialization, the FAIL pin is asserted 
(0). An internal self-test is performed as part of the initialization process. If this self-test 
passes, the FAIL pin is deasserted (1); otherwise it remains asserted. The FAIL pin is 
reasserted while the processor performs an external bus self-confidence test. If this 
self-test passes, the processor deasserts the FAIL pin and branches to the user's 
initialization routine; otherwise the FAIL pin remains asserted. Internal self-test and the 
use of the FAIL pin can be disabled with the STEST pin. 

Table 3. 80960CA Pin Description— Processor Control Signals (Continued) 



Name 


Type 


Description 


STEST 


I 

S(L) 
H(Z) 
R(Z) 


SELF TEST causes the processor's internal self-test feature to be enabled or 
disabled at initialization. STEST is read on the rising edge of RESET. When asserted 
(high), the processor's internal self-test and external bus confidence tests are 
performed during processor initialization. When deasserted (low), the internal 
self-test is not performed during initialization; only the external bus confidence 
tests are run. 


ONCE™ 


I 
A(L) 
H(Z) 
R(Z) 


ON CIRCUIT EMULATION causes all outputs to be floated when asserted (low). 
ONCE is continuously sampled while RESET is low, and is latched on the rising edge 
of RESET. To place the processor in the ONCE state: 

(1) assert RESET and ONCE (order does not matter) 

(2) wait for at least 16 CLKIN periods in two-x mode, or 10,000 CLKIN periods in 
one-x mode, after Vcc and CLKIN are within operating specifications 

(3) deassert RESET 

(4) wait at least 32 CLKIN periods 

(The processor will now be latched in the ONCE state as long as RESET is high.) 

To exit the ONCE state, bring Vcc and CLKIN to operating conditions, then assert 
RESET and bring ONCE high prior to deasserting RESET. 

CLKIN must operate within the specified operating conditions of the processor until 
step 4 above has been completed. CLKIN may then be changed to DC to 
achieve the lowest possible ONCE mode leakage current. 

ONCE can be used by emulator products or by board testers to effectively make an 
installed processor transparent in the board. 


CLKIN 


A(E) 
H(Z) 
R(Z) 


CLOCK INPUT is an input for the external clock needed to run the processor. The 
external clock is internally divided as prescribed by the CLKMODE pin to produce 
PCLK2:1. 


CLKMODE 


A(L) 
H(Z) 
R(Z) 


CLOCK MODE selects the division factor applied to the external clock input (CLKIN). 
When CLKMODE is high (1), CLKIN is divided by one to create PCLK2:1 and the 
processor's internal clock. When CLKMODE is low (0), CLKIN is divided by two to 
create PCLK2:1 and the processor's internal clock. CLKMODE should be tied high or 
low in a system, as the clock mode is not latched by the processor. If left 
unconnected, the processor will internally pull the CLKMODE pin low (0), enabling the 
two-x clock mode. 


PCLK2 
PCLK1 




s 

H(Q) 
R(Q) 


PROCESSOR OUTPUT CLOCKS provide a timing reference for all inputs and 
outputs of the processor. All input and output timings are specified in relation to 
PCLK2 and PCLK1. PCLK2 and PCLK1 are identical signals. Two output pins are 
provided to allow flexibility in the system's allocation of capacitive loading on the 
clock. PCLK2:1 may also be connected at the processor to form a single clock signal. 


Vss 


- 


GROUND connections consist of 24 pins which must be connected externally to a 
Vss board plane. 


Vcc 


- 


POWER connections consist of 24 pins which must be connected externally to a Vcc 
board plane. 


N/C 


. - 


NO CONNECT pins must not be connected in a system. 
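
The RESET and CLKMODE descriptions in Table 3 together determine how long RESET has to be held low. The Python sketch below works that out from the cycle counts given above; the function names are hypothetical, and the CLKIN frequencies used in the usage note are illustrative assumptions, not values from this section.

```python
# Minimum RESET assertion, in PCLK2:1 cycles, taken from the RESET description.
RESET_CYCLES = {"two-x": 16, "one-x": 10_000}


def clock_mode(clkmode_pin: int) -> str:
    """CLKMODE high (1) selects one-x mode; low (0) selects two-x mode."""
    if clkmode_pin not in (0, 1):
        raise ValueError("CLKMODE must be 0 or 1")
    return "one-x" if clkmode_pin == 1 else "two-x"


def pclk_hz(clkin_hz: float, clkmode_pin: int) -> float:
    """PCLK2:1 frequency: CLKIN divided by one (one-x) or by two (two-x)."""
    divisor = 1 if clock_mode(clkmode_pin) == "one-x" else 2
    return clkin_hz / divisor


def min_reset_seconds(clkin_hz: float, clkmode_pin: int) -> float:
    """Shortest RESET low time that satisfies the PCLK2:1 cycle count."""
    cycles = RESET_CYCLES[clock_mode(clkmode_pin)]
    return cycles / pclk_hz(clkin_hz, clkmode_pin)
```

Assuming a 66 MHz CLKIN in two-x mode, `pclk_hz(66e6, 0)` is 33 MHz and the minimum RESET time is roughly 0.5 microseconds. Assuming a 33 MHz CLKIN in one-x mode, the 10,000-cycle requirement stretches it to roughly 0.3 milliseconds.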
Table 4. 80960CA Pin Description— DMA and Interrupt Unit Control Signals 



Name 


Type 


Description 


DREQ3 
DREQ2 
DREQ1 
DREQ0 


I 
A(L) 
H(Z) 
R(Z) 


DMA REQUEST causes a DMA transfer to be requested. Each of the four signals 
requests a transfer on a single channel. DREQ0 requests channel 0, DREQ1 
requests channel 1, etc. When two or more channels are requested simultaneously, 
the channel with the highest priority is serviced first. The channel priority mode is 
programmable. 


DACK3 
DACK2 
DACK1 
DACK0 


S 
H(1) 
R(1) 


DMA ACKNOWLEDGE indicates that a DMA transfer is being executed. Each of the 
four signals acknowledges a transfer for a single channel. DACK0 acknowledges 
channel 0, DACK1 acknowledges channel 1, etc. DACK3:0 are active (0) when the 
requesting device of a DMA is accessed. 


EOP3/TC3 
EOP2/TC2 
EOP1/TC1 
EOP0/TC0 


I/O 
A(L) 
H(Z/Q) 
R(Z) 


END OF PROCESS/TERMINAL COUNT can be programmed as either an input 
(EOP3:0) or as an output (TC3:0), but not both. Each pin is individually 
programmable. When programmed as an input, EOPx causes the termination of a 
current DMA transfer for the channel corresponding to the EOPx pin. EOP0 
corresponds to channel 0, EOP1 corresponds to channel 1, etc. When a channel is 
configured for source and destination chaining, the EOP pin for that channel causes 
termination of only the current buffer transferred and causes the next buffer to be 
transferred. EOP3:0 are asynchronous inputs. 

When programmed as an output, the channel's TCx pin indicates that the channel 
byte count has reached 0 and a DMA has terminated. TCx is driven with the same 
timing as DACKx during the last DMA transfer for a buffer. If the last bus request is 
executed as multiple bus accesses, TCx will stay asserted for the entire bus request. 


XINT7 
XINT6 
XINT5 
XINT4 
XINT3 
XINT2 
XINT1 
XINT0 


A(E/L) 
H(Z) 
R(Z) 


EXTERNAL INTERRUPT PINS cause interrupts to be requested. These pins can be 
configured in three modes. 

In the Dedicated Mode, each pin is a dedicated external interrupt source. Dedicated 
inputs can be individually programmed to be level (low) or edge (falling) activated. 

In the Expanded Mode, the 8 pins act together as an 8-bit vectored interrupt source. 
The interrupt pins in this mode are level activated. Since the interrupt pins are active 
low, the vector number requested is the one's complement of the positive logic value 
placed on the port. This eliminates glue logic to interface to combinational priority 
encoders which output negative logic. 

In the Mixed Mode, XINT7:5 are dedicated sources and XINT4:0 act as the 5 most 
significant bits of an expanded mode vector. The least significant bits are set to 010 
internally. 


NMI 


I 
A(E) 
H(Z) 
R(Z) 


NON-MASKABLE INTERRUPT causes a non-maskable interrupt event to occur. 
NMI is the highest priority interrupt recognized. NMI is an edge (falling) activated 
source. 
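
The one's-complement relationship between the XINT7:0 pin levels and the requested vector in Expanded Mode, described for the external interrupt pins in Table 4, can be written out as a short Python sketch. The function names are hypothetical and the model covers Expanded Mode only; Dedicated and Mixed Mode are not represented.

```python
def expanded_mode_vector(xint_levels: int) -> int:
    """Vector number requested in Expanded Mode.

    xint_levels -- 8-bit value seen on XINT7:0 (bit 7 = XINT7),
                   1 = pin high, 0 = pin low (active).
    """
    if not 0 <= xint_levels <= 0xFF:
        raise ValueError("XINT7:0 is an 8-bit port")
    return ~xint_levels & 0xFF  # one's complement, kept to 8 bits


def levels_for_vector(vector: int) -> int:
    """Pin levels external logic would drive to request `vector`."""
    if not 0 <= vector <= 0xFF:
        raise ValueError("vector numbers are 8 bits wide")
    return ~vector & 0xFF
```

Because the two functions are inverses of each other, a priority encoder with negative-logic outputs can drive the port directly, which is the glue-logic saving the description refers to.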
3.3. 80960CA Pinout 

3.3.1 80960CA CPGA PINOUT 

Tables 5 and 6 list the 80960CA pin names with package location. Figure 4a depicts 
the complete 80960CA pinout as viewed from the top side of the component (i.e., pins 
facing down). Figure 4b shows the complete 80960CA pinout as viewed from the 
pin-side of the package (i.e., pins facing up). See Section 4.0, Electrical Specifications 
for specifications and recommended connections. 







Table 5. PGA Pin Name with Package Location (Signal Order) 

Address Bus     Data Bus        Bus Control      Processor Control   I/O
Name  Location  Name  Location  Name   Location  Name      Location  Name     Location
A31   S15       D31   R03       BE3    S05       RESET     A16       DREQ3    A07
A30   Q13       D30   Q05       BE2    S06       FAIL      A02       DREQ2    B06
A29   R14       D29   S02       BE1    S07       STEST     B02       DREQ1    A06
A28   Q14       D28   Q04       BE0    R09       ONCE      C03       DREQ0    B05
A27   S16       D27   R02       W/R    S10       CLKIN     C13       DACK3    A10
A26   R15       D26   Q03       ADS    R06       CLKMODE   C14       DACK2    A09
A25   S17       D25   S01       READY  S03       PCLK1     B14       DACK1    A08
A24   Q15       D24   R01       BTERM  R04       PCLK2     B13       DACK0    B08
A23   R16       D23   Q02       WAIT   S12                           EOP/TC0  A11
A22   R17       D22   P03       BLAST  S08                           EOP/TC1  A12
A21   Q16       D21   Q01       DT/R   S11                           EOP/TC2  A13
A20   P15       D20   P02       DEN    S09                           EOP/TC3  A14
A19   P16       D19   P01       LOCK   S14                           XINT7    C17
A18   Q17       D18   N02       HOLD   R05                           XINT6    C16
A17   P17       D17   N01       HOLDA  S04                           XINT5    B17
A16   N16       D16   M01       BREQ   R13                           XINT4    C15
A15   N17       D15   L01       D/C    S13                           XINT3    B16
A14   M17       D14   L02       DMA    R12                           XINT2    A17
A13   L16       D13   K01       SUP    Q12                           XINT1    A15
A12   L17       D12   J01       BOFF   B01                           XINT0    B15
A11   K17       D11   H01                                            NMI      D15
A10   J17       D10   H02
A9    H17       D9    G01
A8    G17       D8    F01
A7    G16       D7    E01
A6    F17       D6    F02
A5    E17       D5    D01
A4    E16       D4    E02
A3    D17       D3    C01
A2    D16       D2    D02
                D1    C02
                D0    E03

Vss Location: C07, C08, C09, C10, C11, C12, F15, G03, G15, H03, H15, J03, 
J15, K03, K15, L03, L15, M03, M15, Q07, Q08, Q09, Q10, Q11 

Vcc Location: B07, B09, B10, B11, B12, C06, E15, F03, F16, G02, H16, J02, 
J16, K02, K16, M02, M16, N03, N15, Q06, R07, R08, R10, R11 

No Connect Location: A01, A03, A04, A05, B03, B04, C04, C05, D03 
Table 6. PGA Pin Name with Package Location (Pin Order) 

Location Name     Location Name     Location Name     Location Name     Location Name
A01      NC       C01      D3       G01      D9       M01      D16      R01      D24
A02      FAIL     C02      D1       G02      Vcc      M02      Vcc      R02      D27
A03      NC       C03      ONCE     G03      Vss      M03      Vss      R03      D31
A04      NC       C04      NC       G15      Vss      M15      Vss      R04      BTERM
A05      NC       C05      NC       G16      A7       M16      Vcc      R05      HOLD
A06      DREQ1    C06      Vcc      G17      A8       M17      A14      R06      ADS
A07      DREQ3    C07      Vss      H01      D11      N01      D17      R07      Vcc
A08      DACK1    C08      Vss      H02      D10      N02      D18      R08      Vcc
A09      DACK2    C09      Vss      H03      Vss      N03      Vcc      R09      BE0
A10      DACK3    C10      Vss      H15      Vss      N15      Vcc      R10      Vcc
A11      EOP/TC0  C11      Vss      H16      Vcc      N16      A16      R11      Vcc
A12      EOP/TC1  C12      Vss      H17      A9       N17      A15      R12      DMA
A13      EOP/TC2  C13      CLKIN    J01      D12      P01      D19      R13      BREQ
A14      EOP/TC3  C14      CLKMODE  J02      Vcc      P02      D20      R14      A29
A15      XINT1    C15      XINT4    J03      Vss      P03      D22      R15      A26
A16      RESET    C16      XINT6    J15      Vss      P15      A20      R16      A23
A17      XINT2    C17      XINT7    J16      Vcc      P16      A19      R17      A22
B01      BOFF     D01      D5       J17      A10      P17      A17      S01      D25
B02      STEST    D02      D2       K01      D13      Q01      D21      S02      D29
B03      NC       D03      NC       K02      Vcc      Q02      D23      S03      READY
B04      NC       D15      NMI      K03      Vss      Q03      D26      S04      HOLDA
B05      DREQ0    D16      A2       K15      Vss      Q04      D28      S05      BE3
B06      DREQ2    D17      A3       K16      Vcc      Q05      D30      S06      BE2
B07      Vcc      E01      D7       K17      A11      Q06      Vcc      S07      BE1
B08      DACK0    E02      D4       L01      D15      Q07      Vss      S08      BLAST
B09      Vcc      E03      D0       L02      D14      Q08      Vss      S09      DEN
B10      Vcc      E15      Vcc      L03      Vss      Q09      Vss      S10      W/R
B11      Vcc      E16      A4       L15      Vss      Q10      Vss      S11      DT/R
B12      Vcc      E17      A5       L16      A13      Q11      Vss      S12      WAIT
B13      PCLK2    F01      D8       L17      A12      Q12      SUP      S13      D/C
B14      PCLK1    F02      D6                         Q13      A30      S14      LOCK
B15      XINT0    F03      Vcc                        Q14      A28      S15      A31
B16      XINT3    F15      Vss                        Q15      A24      S16      A27
B17      XINT5    F16      Vcc                        Q16      A21      S17      A25
                  F17      A6                         Q17      A18
Figure 4a. 80960CA PGA Pinout (View from Top Side) 

Figure 4b. 80960CA PGA Pinout (View from Bottom Side) 
3.3.2 80960CA PQFP Pinout 

Tables 7 and 8 list the 80960CA pin names with package location. See Section 4.0, 
Electrical Specifications for specifications and recommended connections. 



Table 7. PQFP Pin Name with Package Location (Signal Order) 

Address Bus     Data Bus        Bus Control      Processor Control   I/O
Name  Location  Name  Location  Name   Location  Name      Location  Name     Location
A31   153       D31   186       BE3    176       RESET     091       DREQ3    060
A30   152       D30   187       BE2    175       FAIL      045       DREQ2    059
A29   151       D29   188       BE1    172       STEST     046       DREQ1    058
A28   145       D28   189       BE0    170       ONCE      043       DREQ0    057
A27   144       D27   191       W/R    164       CLKIN     087       DACK3    065
A26   143       D26   192       ADS    178       CLKMODE   085       DACK2    064
A25   142       D25   194       READY  182       PCLK1     078       DACK1    063
A24   141       D24   195       BTERM  184       PCLK2     074       DACK0    062
A23   139       D23   003       WAIT   162                           EOP/TC3  069
A22   138       D22   004       BLAST  169                           EOP/TC2  068
A21   137       D21   005       DT/R   163                           EOP/TC1  067
A20   136       D20   006       DEN    167                           EOP/TC0  066
A19   134       D19   008       LOCK   156                           XINT7    107
A18   133       D18   009       HOLD   181                           XINT6    106
A17   132       D17   010       HOLDA  179                           XINT5    102
A16   130       D16   011       BREQ   155                           XINT4    101
A15   129       D15   013       D/C    159                           XINT3    100
A14   128       D14   014       DMA    160                           XINT2    095
A13   124       D13   015       SUP    158                           XINT1    094
A12   123       D12   017       BOFF   040                           XINT0    093
A11   122       D11   018                                            NMI      108
A10   120       D10   019
A9    119       D9    021
A8    118       D8    022
A7    117       D7    023
A6    116       D6    025
A5    114       D5    026
A4    113       D4    027
A3    112       D3    033
A2    111       D2    034
                D1    035
                D0    036

Vss Location: 2, 7, 16, 24, 30, 38, 39, 49, 56, 70, 75, 77, 81, 83, 88, 89, 
92, 98, 105, 109, 110, 121, 125, 131, 135, 147, 150, 161, 165, 173, 174, 185, 196 

Vcc Location: 1, 12, 20, 28, 32, 37, 44, 50, 61, 71, 72, 79, 82, 96, 99, 103, 
115, 127, 140, 148, 154, 168, 171, 180, 190 

No Connect Location: 29, 31, 41, 42, 47, 48, 51, 52, 53, 54, 55, 73, 76, 80, 84, 
86, 90, 97, 104, 126, 146, 149, 157, 166, 177, 183, 193 
Table 8. PQFP Pin Name with Package Location (Pin Order) 

Pin  Signal     Pin  Signal     Pin  Signal     Pin  Signal
1    Vcc        50   Vcc        99   Vcc        148  Vcc
2    Vss        51   NC         100  XINT3      149  NC
3    D23        52   NC         101  XINT4      150  Vss
4    D22        53   NC         102  XINT5      151  A29
5    D21        54   NC         103  Vcc        152  A30
6    D20        55   NC         104  NC         153  A31
7    Vss        56   Vss        105  Vss        154  Vcc
8    D19        57   DREQ0      106  XINT6      155  BREQ
9    D18        58   DREQ1      107  XINT7      156  LOCK
10   D17        59   DREQ2      108  NMI        157  NC
11   D16        60   DREQ3      109  Vss        158  SUP
12   Vcc        61   Vcc        110  Vss        159  D/C
13   D15        62   DACK0      111  A2         160  DMA
14   D14        63   DACK1      112  A3         161  Vss
15   D13        64   DACK2      113  A4         162  WAIT
16   Vss        65   DACK3      114  A5         163  DT/R
17   D12        66   EOP0/TC0   115  Vcc        164  W/R
18   D11        67   EOP1/TC1   116  A6         165  Vss
19   D10        68   EOP2/TC2   117  A7         166  NC
20   Vcc        69   EOP3/TC3   118  A8         167  DEN
21   D9         70   Vss        119  A9         168  Vcc
22   D8         71   Vcc        120  A10        169  BLAST
23   D7         72   Vcc        121  Vss        170  BE0
24   Vss        73   NC         122  A11        171  Vcc
25   D6         74   PCLK2      123  A12        172  BE1
26   D5         75   Vss        124  A13        173  Vss
27   D4         76   NC         125  Vss        174  Vss
28   Vcc        77   Vss        126  NC         175  BE2
29   NC         78   PCLK1      127  Vcc        176  BE3
30   Vss        79   Vcc        128  A14        177  NC
31   NC         80   NC         129  A15        178  ADS
32   Vcc        81   Vss        130  A16        179  HOLDA
33   D3         82   Vcc        131  Vss        180  Vcc
34   D2         83   Vss        132  A17        181  HOLD
35   D1         84   NC         133  A18        182  READY
36   D0         85   CLKMODE    134  A19        183  NC
37   Vcc        86   NC         135  Vss        184  BTERM
38   Vss        87   CLKIN      136  A20        185  Vss
39   Vss        88   Vss        137  A21        186  D31
40   BOFF       89   Vss        138  A22        187  D30
41   NC         90   NC         139  A23        188  D29
42   NC         91   RESET      140  Vcc        189  D28
43   ONCE       92   Vss        141  A24        190  Vcc
44   Vcc        93   XINT0      142  A25        191  D27
45   FAIL       94   XINT1      143  A26        192  D26
46   STEST      95   XINT2      144  A27        193  NC
47   NC         96   Vcc        145  A28        194  D25
48   NC         97   NC         146  NC         195  D24
49   Vss        98   Vss        147  Vss        196  Vss





Figure 4c. 80960CA PQFP Pinout (View from Top Side) 
3.4. Mechanical Data 

3.4.1 CERAMIC PGA PACKAGE 


Family: Ceramic Pin Grid Array Package 

          Millimeters                    Inches
Symbol    Min      Max      Notes        Min      Max      Notes
A         3.56     4.57                  0.140    0.180
A1        0.64     1.14     SOLID LID    0.025    0.045    SOLID LID
A2        23       0.30     SOLID LID    0.110    0.140    SOLID LID
A3        1.14     1.40                  0.045    0.055
B         0.43     0.51                  0.017    0.020
D         44.07    44.83                 1.735    1.765
D1        40.51    40.77                 1.595    1.605
e1        2.29     2.79                  0.090    0.110
L         2.54     3.30                  0.100    0.130
N         168                            168
S1        1.52     2.54                  0.060    0.100

ISSUE IWS REVX 7/15/88 





Figure 5. 168-Lead Ceramic PGA Package Dimensions 
Table 9. Ceramic PGA Package Dimension Symbols 

Letter or 
Symbol    Description of Dimensions
A         Distance from seating plane to highest point of body
A1        Distance between seating plane and base plane (lid)
A2        Distance from base plane to highest point of body
A3        Distance from seating plane to bottom of body
B         Diameter of terminal lead pin
D         Largest overall package dimension of length
D1        A body length dimension, outer lead center to outer lead center
e1        Linear spacing between true lead position centerlines
L         Distance from seating plane to end of lead
S1        Other body dimension, outer lead center to edge of body

NOTES: 

1. Controlling dimension: millimeter. 

2. Dimension "e1" ("e") is non-cumulative. 

3. Seating plane (standoff) is defined by P.C. board hole size: 0.0415-0.0430 inch. 

4. Dimensions "B", "B1" and "C" are nominal. 

5. Details of Pin 1 identifier are optional. 
3.4.2 PLASTIC QUAD FLAT PACKAGE 

Figure 6. Principal Dimensions and Datums (dimensions in mm (inch)) 

Figure 7. Molded Details (dimensions in mm (inch)) 

Figure 8. Detail M (dimensions in mm (inch)) 

Figure 9. Terminal Details (dimensions in mm (inch)) 

Figure 10. Typical Lead (Detail J and Detail L; dimensions in mm (inch)) 

Table 10. PQFP Package Dimension Symbols 

                                   Inch                 mm
Symbol   Description               Min       Max        Min       Max
N        Leadcount                 196                  196
A        Package Height            0.160     0.170      4.06      4.32
A1       Standoff                  0.020     0.030      0.51      0.76
D, E     Terminal Dimension        1.475     1.485      37.47     37.72
D1, E1   Package Body              1.347     1.353      34.21     34.37
D2, E2   Bumper Distance           1.497     1.503      38.02     38.18
D3, E3   Lead Dimension            1.200 REF            30.48 REF
D4, E4   Foot Radius Location      1.423     1.437      36.14     36.49
L1       Foot Length               0.020     0.030      0.51      0.76



NOTES: 

1. All dimensions and tolerances conform to ANSI Y14.5M-1982. 

2. Datum plane -H- located at the mold parting line and coincident with the bottom of the lead where lead exits plastic body. 

3. Datums A-B and -D- to be determined where center leads exit plastic body at datum plane -H-. 

4. Controlling Dimension, Inch. 

5. Dimensions D1, D2, E1 and E2 are measured at the mold parting line. D1 and E1 do not include an allowable mold 
protrusion of 0.18 mm (0.007 in) per side. D2 and E2 do not include a total allowable mold protrusion of 0.18 mm (0.007 in) 
at maximum package size. 

6. Pin 1 identifier is located within one of the two zones indicated. 

7. Measured at datum plane -H-. 

8. Measured at seating plane datum -C-. 
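
Because inch is the controlling dimension (note 4), the millimeter column of Table 10 can be cross-checked by conversion. The sketch below is illustrative only: the function names and the 0.02 mm tolerance are assumptions, and only a few rows of the table are included.

```python
# Cross-check a few Table 10 rows against the controlling inch dimension.
MM_PER_INCH = 25.4  # exact by definition

# (label, inch value, mm value printed in Table 10)
ROWS = [
    ("A min", 0.160, 4.06),
    ("A max", 0.170, 4.32),
    ("D1,E1 min", 1.347, 34.21),
    ("D3,E3 ref", 1.200, 30.48),
]


def inch_to_mm(inches: float) -> float:
    """Convert inches to millimeters."""
    return inches * MM_PER_INCH


def mismatches(rows, tol_mm: float = 0.02):
    """Return the rows whose printed mm value is more than tol_mm away
    from the converted inch value (tol_mm is an assumed rounding allowance)."""
    return [row for row in rows if abs(inch_to_mm(row[1]) - row[2]) > tol_mm]


if __name__ == "__main__":
    print(mismatches(ROWS))
```

An empty list means every listed millimeter value agrees with its inch value to within the assumed rounding allowance.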
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3.5. Package Thermal Specifications 

The 80960CA is specified for operation when T_C (the case temperature) is within the range of 0°C to 100°C. T_C may be measured in any environment to determine whether the 80960CA is within the specified operating range. The case temperature should be measured at the center of the top surface, opposite the pins. Refer to Figure 13.

T_A (the ambient temperature) can be calculated from θ_CA (thermal resistance from case to ambient) with the following equation:

$$T_A = T_C - P \cdot \theta_{CA}$$

Table 11 shows the maximum T_A allowable (without exceeding T_C) at various airflows and operating frequencies (f_PCLK).

Note that T_A is greatly improved by attaching fins or a heat sink to the package. P (the maximum power consumption) is calculated by using the typical I_CC as tabulated in Section 4.4, DC Characteristics, and V_CC of 5V.
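
As a worked instance of the equation above, the following sketch computes the maximum ambient temperature for one operating point. The function name is hypothetical; the inputs (typical I_CC of 750 mA for the 80960CA-33, θ_CA of 17 °C/W for the PGA with no heat sink and no airflow, T_C maximum of 100°C) are taken from the tables in this data sheet.

```python
def max_ambient_c(t_case_max_c: float, icc_typ_a: float,
                  theta_ca_c_per_w: float, vcc_v: float = 5.0) -> float:
    """T_A = T_C - P * theta_CA, with P = typical I_CC * V_CC."""
    power_w = icc_typ_a * vcc_v
    return t_case_max_c - power_w * theta_ca_c_per_w


if __name__ == "__main__":
    # 80960CA-33, PGA, no heat sink, still air: P = 0.75 A * 5 V = 3.75 W
    print(max_ambient_c(100.0, 0.750, 17.0))  # 36.25
```

Truncated to a whole degree, the result matches the 36°C entry for 33 MHz, no heat sink, zero airflow.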



Table 11. Maximum T_A at Various Airflows in °C (PGA Package Only)

Airflow, ft/min (m/sec)

 | f_PCLK (MHz) | 0 (0) | 200 (1.01) | 400 (2.03) | 600 (3.04) | 800 (4.06) | 1000 (5.07)
T_A with Heat Sink* | 33 | 51 | 66 | 79 | 81 | 85 | 87
 | 25 | 61 | 73 | 83 | 85 | 88 | 89
 | 16 | 74 | 82 | 89 | 90 | 92 | 93
T_A without Heat Sink | 33 | 36 | 47 | 59 | 66 | 73 | 75
 | 25 | 49 | 58 | 67 | 73 | 78 | 80
 | 16 | 66 | 72 | 78 | 82 | 86 | 87

* 0.285" high unidirectional heat sink (Al alloy 6061, 50 mil fin width, 150 mil center-to-center fin spacing).



PGA Thermal Resistance (°C/Watt)

Airflow, ft/min (m/sec)

Parameter | 0 (0) | 200 (1.01) | 400 (2.03) | 600 (3.04) | 800 (4.06) | 1000 (5.07)
θ Junction-to-Case (Case measured as shown in Figure 13) | 1.5 | 1.5 | 1.5 | 1.5 | 1.5 | 1.5
θ Case-to-Ambient (No Heatsink) | 17 | 14 | 11 | 9 | 7.1 | 6.6
θ Case-to-Ambient (with Unidirectional Heatsink)* | 13 | 9 | 5.5 | 5.0 | 3.9 | 3.4

NOTES:

1. This table applies to 80960CA PGA plugged into socket or soldered directly into board.

2. θ_JA = θ_JC + θ_CA.

3. θ_J-CAP = 4°C/W (approx.)
   θ_J-PIN = 4°C/W (inner pins) (approx.)
   θ_J-PIN = 8°C/W (outer pins) (approx.)

* 0.285" high unidirectional heat sink (Al alloy 6061, 50 mil fin width, 150 mil center-to-center fin spacing).



Figure 11. 80960CA PGA Package Thermal Characteristics 
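
Note 2 gives θ_JA as the sum of θ_JC and θ_CA. A minimal sketch of how that sum might be used follows; the junction-temperature relation T_J = T_A + P × θ_JA is the usual steady-state thermal model and is an assumption here, since this section specifies limits on T_C only. Function names are illustrative.

```python
def theta_ja(theta_jc: float, theta_ca: float) -> float:
    """Junction-to-ambient thermal resistance per note 2 (C/W)."""
    return theta_jc + theta_ca


def junction_temp_c(t_ambient_c: float, power_w: float,
                    theta_jc: float, theta_ca: float) -> float:
    """Assumed steady-state model: T_J = T_A + P * theta_JA."""
    return t_ambient_c + power_w * theta_ja(theta_jc, theta_ca)


if __name__ == "__main__":
    # PGA, no heat sink, still air: theta_JC = 1.5, theta_CA = 17
    print(theta_ja(1.5, 17.0))
    print(junction_temp_c(36.25, 3.75, 1.5, 17.0))
```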
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PQFP Thermal Resistance— °C/Watt 


Airflow, ft/min (m/sec)

Parameter | 0 (0) | 50 (0.25) | 100 (0.50) | 200 (1.01) | 400 (2.03) | 600 (3.04) | 800 (4.06)
θ Junction-to-Case (Case measured as shown in Figure 13) | 5 | 5 | 5 | 5 | 5 | 5 | 5
θ Case-to-Ambient (No Heatsink) | 19 | 18 | 17 | 15 | 12 | 10 | 9

NOTES:

1. This table applies to 80960CA PQFP soldered directly into board.
2. θ_JA = θ_JC + θ_CA.
3. θ_JL = 18°C/Watt
   θ_JB = 18°C/Watt




Figure 12. 80960CA PQFP Package Thermal Characteristics 



[Drawing, reference 270727-62: measure PGA case temperature at center of top surface (168-pin PGA); measure PQFP case temperature at center of top surface (196-pin PQFP, with Pin 1 and Pin 196 marked).]




Figure 13. Measuring 80960CA PGA and PQFP Case Temperature 
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3.6 Stepping Register Information 

Upon reset, register G0 contains die stepping information. The following figure shows how G0 is configured. The most significant byte contains an ASCII 0. The upper middle byte contains an ASCII C. The lower middle byte contains an ASCII A. The least significant byte contains the stepping number in ASCII. G0 retains this information until it is written over by the user program.

Table 12 contains a cross reference of the number in the least significant byte of register G0 to the die stepping number.



 | MSB | | | LSB
ASCII | 00 | 43 | 41 | Stepping Number
DECIMAL | | C | A | Stepping Number

Figure 14. Register G0
Table 12. Die Stepping Cross Reference 



G0 Least Significant Byte | Die Stepping
01 | B
02 | C-1
03 | C-2
04 | D
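
The layout in Figure 14 and the cross reference in Table 12 can be combined into a small decoder. This is a sketch under stated assumptions: the function name is hypothetical, and it assumes the saved G0 value is available as a 32-bit unsigned integer whose byte values match Figure 14 (00, 43, 41 hex, then the stepping byte).

```python
# Table 12: G0 least significant byte -> die stepping
STEPPINGS = {0x01: "B", 0x02: "C-1", 0x03: "C-2", 0x04: "D"}


def decode_g0(value: int) -> str:
    """Return the die stepping encoded in a saved copy of register G0.

    Raises ValueError if the upper three bytes are not 00, 'C', 'A'.
    (int.to_bytes raises OverflowError if value does not fit in 32 bits.)
    """
    raw = value.to_bytes(4, "big")  # raw[0] is the most significant byte
    if raw[0] != 0x00 or raw[1:3] != b"CA":
        raise ValueError(f"unexpected G0 signature: {raw.hex()}")
    return STEPPINGS.get(raw[3], "unknown")


if __name__ == "__main__":
    print(decode_g0(0x00434104))
```

Because G0 is overwritten by the user program, the value has to be captured immediately after reset for this check to be meaningful.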



3.7 Suggested Sources for 80960CA 
Accessories 

The following are some suggested sources of accessories for the 80960CA. They are not an endorsement of any kind, nor a warranty of the performance of any of the listed products and/or companies.

Sockets 

1. 3M Textool Test and Interconnection Products 
Department 

P.O. Box 2963 

Austin, TX 78769-2963 

2. Augat, Inc. 
Interconnection Products Group 
33 Perry Avenue 

P.O. Box 779 
Attleboro, MA 02703 
(508) 222-2202 

3. Concept Manufacturing Inc. 
(Decoupling Sockets) 
43024 Christy Street 
Fremont, CA 94538 
(415)651-3804 

Heat Sinks/Fins 

1. Thermalloy, Inc. 

2021 West Valley View Lane 
Dallas, TX 75381-0839 
(214)243-4321 

2. E G & G Division 
60 Audubon Road 
Wakefield, MA 01880 
(617)245-5900 
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4.0 ELECTRICAL SPECIFICATIONS 



4.1 Absolute Maximum Ratings 



Parameter | Maximum Rating
Storage Temperature | -65°C to +150°C
Case Temperature Under Bias | -65°C to +110°C
Supply Voltage wrt. V_SS | -0.5V to +6.5V
Voltage on Other Pins wrt. V_SS | -0.5V to V_CC + 0.5V



NOTICE: This is a production data sheet. The specifications are subject to change without notice.

* WARNING: Stressing the device beyond the "Absolute Maximum Ratings" may cause permanent damage. These are stress ratings only. Operation beyond the "Operating Conditions" is not recommended and extended exposure beyond the "Operating Conditions" may affect device reliability.



4.2. Operating Conditions 

Operating Conditions (80960CA-33, -25, -16) 



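Symbol | Parameter | Min | Max | Units | Notes
V_CC | Supply Voltage, 80960CA-33 | 4.75 | 5.25 | V |
 | Supply Voltage, 80960CA-25 | 4.50 | 5.50 | V |
 | Supply Voltage, 80960CA-16 | 4.50 | 5.50 | V |
f_CLK2x | Input Clock Frequency (2-x Mode), 80960CA-33 | | 66 | MHz |
 | Input Clock Frequency (2-x Mode), 80960CA-25 | | 50 | MHz |
 | Input Clock Frequency (2-x Mode), 80960CA-16 | | 32 | MHz |
f_CLK1x | Input Clock Frequency (1-x Mode), 80960CA-33 | 8 | 33 | MHz | (1)
 | Input Clock Frequency (1-x Mode), 80960CA-25 | 8 | 25 | MHz | (1)
 | Input Clock Frequency (1-x Mode), 80960CA-16 | 8 | 16 | MHz | (1)
T_C | Case Temperature Under Bias (80960CA-33, -25, -16), PGA Package | | 100 | °C |
 | Case Temperature Under Bias (80960CA-33, -25, -16), 196-Pin PQFP | | 100 | °C |
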


NOTE: 

(1) When in the 1-x input clock mode, CLKIN is an input to an internal phase-locked loop and must maintain a minimum 
frequency of 8 MHz for proper processor operation. However, in the 1-x Mode, CLKIN may still be stopped when the 
processor either is in a reset condition or is reset. If CLKIN is stopped, the specified RESET low time must be provided once 
CLKIN restarts and has stabilized. 
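
The limits above lend themselves to a simple design-rule check. The sketch below is illustrative: the data structure and function name are assumptions, the 0°C lower case-temperature bound is taken from Section 3.5, and no minimum is applied in 2-x mode because none is legible in the table.

```python
GRADES = {
    "80960CA-33": {"vcc": (4.75, 5.25), "f1x": (8.0, 33.0), "f2x_max": 66.0},
    "80960CA-25": {"vcc": (4.50, 5.50), "f1x": (8.0, 25.0), "f2x_max": 50.0},
    "80960CA-16": {"vcc": (4.50, 5.50), "f1x": (8.0, 16.0), "f2x_max": 32.0},
}
T_CASE_RANGE_C = (0.0, 100.0)  # Section 3.5


def violations(grade: str, vcc_v: float, clkin_mhz: float,
               one_x_mode: bool, t_case_c: float) -> list:
    """Return a list of operating-condition violations (empty if none)."""
    limits = GRADES[grade]
    found = []
    lo, hi = limits["vcc"]
    if not lo <= vcc_v <= hi:
        found.append("V_CC out of range")
    if one_x_mode:
        f_lo, f_hi = limits["f1x"]
        if not f_lo <= clkin_mhz <= f_hi:
            found.append("CLKIN out of range for 1-x mode")
    elif clkin_mhz > limits["f2x_max"]:
        found.append("CLKIN too fast for 2-x mode")
    if not T_CASE_RANGE_C[0] <= t_case_c <= T_CASE_RANGE_C[1]:
        found.append("T_C out of range")
    return found


if __name__ == "__main__":
    print(violations("80960CA-25", 5.0, 4.0, True, 50.0))
```

The 8 MHz floor in 1-x mode reflects note (1): the internal phase-locked loop needs a running clock except while the processor is held in reset.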



4.3 Recommended Connections 

Power and ground connections must be made to multiple V_CC and V_SS (GND) pins. Every 80960CA-based circuit board should include power (V_CC) and ground (V_SS) planes for power distribution. Every V_CC pin must be connected to the power plane, and every V_SS pin must be connected to the ground plane. Pins identified as "N.C." must not be connected in the system.

Liberal decoupling capacitance should be placed near the 80960CA. The processor can cause transient power surges when its numerous output buffers transition, particularly when connected to large capacitive loads.

Low inductance capacitors and interconnects are recommended for best high frequency electrical performance. Inductance can be reduced by shortening the board traces between the processor and decoupling capacitors as much as possible. Capacitors specifically designed for PGA packages will offer the lowest possible inductance.

For reliable operation, always connect unused inputs to an appropriate signal level. In particular, any unused interrupt (XINT, NMI) or DMA (DREQ) input should be connected to V_CC through a pull-up resistor, as should BTERM if not used. Pull-up resistors should be in the range of 20 KΩ for each pin tied high. If READY or HOLD are not used, the unused input should be connected to ground. N.C. pins must always remain unconnected. Refer to the 80960CA User's Manual for more information.
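
The tie-off rules in the last paragraph can be captured as a lookup for a schematic checklist. This is only a sketch: the function name and the returned strings are hypothetical, and it covers only the pins named in this section.

```python
import re

PULL_UP_FAMILIES = {"XINT", "NMI", "DREQ", "BTERM"}  # tie to V_CC via pull-up
GROUND_FAMILIES = {"READY", "HOLD"}                  # tie to ground


def tie_off(pin_name: str) -> str:
    """Suggest how to terminate an unused pin, per Section 4.3."""
    name = pin_name.strip().upper()
    if name in ("N.C.", "NC"):
        return "leave unconnected"
    family = re.sub(r"\d+$", "", name)  # XINT3 -> XINT, DREQ0 -> DREQ
    if family in PULL_UP_FAMILIES:
        return "pull up to VCC (about 20 kohm)"
    if family in GROUND_FAMILIES:
        return "connect to ground"
    return "not covered here; see the 80960CA User's Manual"


if __name__ == "__main__":
    for pin in ("XINT3", "HOLD", "N.C."):
        print(pin, "->", tie_off(pin))
```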
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4.4. DC Specifications 

DC Characteristics 

(80960CA-33, -25, -16 under the conditions described in Section 4.2, Operating Conditions.)

Symbol | Parameter | Min | Max | Units | Notes
V_IL | Input Low Voltage for all pins except RESET | -0.3 | 0.8 | V |
V_IH | Input High Voltage for all pins except RESET | 2.0 | V_CC + 0.3 | V |
V_OL | Output Low Voltage | | 0.45 | V | I_OL = 5 mA
V_OH | Output High Voltage, I_OH = -1 mA | 2.4 | | V |
 | Output High Voltage, I_OH = -200 µA | V_CC - 0.5 | | V |
V_ILR | Input Low Voltage for RESET | -0.3 | 1.5 | V |
V_IHR | Input High Voltage for RESET | 3.5 | V_CC + 0.3 | V |
I_LI1 | Input Leakage Current for each pin except BTERM, ONCE, DREQ3:0, STEST, EOP3:0/TC3:0, NMI, XINT7:0, READY, HOLD, BOFF, CLKMODE | | ±15 | µA | 0 ≤ V_IN ≤ V_CC (1)
I_LI2 | Input Leakage Current for: BTERM, ONCE, DREQ3:0, STEST, EOP3:0/TC3:0, NMI, XINT7:0, BOFF | | -300 | µA | V_IN = 0.45V (2)
I_LI3 | Input Leakage Current for: READY, HOLD, CLKMODE | | 500 | µA | V_IN = 2.4V (3)
I_LO | Output Leakage Current | | ±15 | µA | 0.45V ≤ V_OUT ≤ V_CC
I_CC | Supply Current (80960CA-33), I_CC Max | | 900 | mA | (4)
 | Supply Current (80960CA-33), I_CC Typ | | 750 | mA | (5)
I_CC | Supply Current (80960CA-25), I_CC Max | | 750 | mA | (4)
 | Supply Current (80960CA-25), I_CC Typ | | 600 | mA | (5)
I_CC | Supply Current (80960CA-16), I_CC Max | | 550 | mA | (4)
 | Supply Current (80960CA-16), I_CC Typ | | 400 | mA | (5)
I_ONCE | ONCE-mode Supply Current | | 100 | mA |
C_IN | Input Capacitance for: CLKIN, RESET, ONCE, READY, HOLD, DREQ3:0, BOFF, XINT7:0, NMI, BTERM, CLKMODE | | 12 | pF | F_C = 1 MHz
C_OUT | Output Capacitance of each output pin | | 12 | pF | F_C = 1 MHz, (6)
C_I/O | I/O Pin Capacitance | | 12 | pF | F_C = 1 MHz





NOTES: 

(1) No Pull-up or pull-down. 

(2) These pins have internal pullup resistors. 

(3) These pins have internal pulldown resistors. 

(4) Measured at worst case frequency, V_CC and temperature, with device operating and outputs loaded to the test conditions described in Section 4.5.1, AC Test Conditions.

(5) I_CC Typical is not tested.

(6) Output Capacitance is the capacitive load of a floating output.

(7) CLKMODE pin has a pull down resistor only when ONCE pin is deasserted. 
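
The input thresholds in the DC table can be read as a classification of a measured pin voltage. The sketch below is an illustration with hypothetical names; it uses V_IL/V_IH for ordinary pins and V_ILR/V_IHR for RESET, and treats the band between the two thresholds as undefined.

```python
def classify_input(v_in: float, reset_pin: bool = False,
                   vcc_v: float = 5.0) -> str:
    """Classify an input voltage against the DC input thresholds."""
    v_low_max, v_high_min = (1.5, 3.5) if reset_pin else (0.8, 2.0)
    if v_in < -0.3 or v_in > vcc_v + 0.3:
        return "out of range"   # outside the specified input voltage limits
    if v_in <= v_low_max:
        return "low"
    if v_in >= v_high_min:
        return "high"
    return "undefined"          # between the low and high thresholds


if __name__ == "__main__":
    print(classify_input(1.2))                  # ordinary pin
    print(classify_input(1.2, reset_pin=True))  # RESET pin
```

The same 1.2V level is undefined on an ordinary input but a valid low on RESET, which has wider thresholds.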
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4.5 AC Specification 

AC Characteristics — 80960CA-33 
(80960CA-33 only, under the conditions described in Section 4.2, Operating Conditions and Section 4.5.1, 
AC Test Conditions.) 



Symbol | Parameter | Min | Max | Units | Notes

INPUT CLOCK (10)
T_F | CLKIN Frequency | | 66 | MHz | (1)
T_C | CLKIN Period, In One-X Mode (f_CLK1x) | 30.3 | 125 | ns | (1,12)
 | CLKIN Period, In Two-X Mode (f_CLK2x) | 15.15 | ∞ | ns | (1)
T_CS | CLKIN Period Stability, In One-X Mode (f_CLK1x) | | ±0.1% | Δ | (1,13)
T_CH | CLKIN High Time, In One-X Mode (f_CLK1x) | 6 | 62.5 | ns | (1,12)
 | CLKIN High Time, In Two-X Mode (f_CLK2x) | 6 | ∞ | ns | (1)
T_CL | CLKIN Low Time, In One-X Mode (f_CLK1x) | 6 | 62.5 | ns | (1,12)
 | CLKIN Low Time, In Two-X Mode (f_CLK2x) | 6 | ∞ | ns | (1)
T_CR | CLKIN Rise Time | | 6 | ns | (1)
T_CF | CLKIN Fall Time | | 6 | ns | (1)

OUTPUT CLOCKS (9)
T_CP | CLKIN to PCLK2:1 Delay, In One-X Mode (f_CLK1x) | -2 | 2 | ns | (1,3,13,14)
 | CLKIN to PCLK2:1 Delay, In Two-X Mode (f_CLK2x) | 2 | 25 | ns | (1,3)
T | PCLK2:1 Period, In One-X Mode (f_CLK1x) | T_C | T_C | ns | (1,13)
 | PCLK2:1 Period, In Two-X Mode (f_CLK2x) | 2T_C | 2T_C | ns | (1,3)
T_PH | PCLK2:1 High Time | (T/2) - 2 | T/2 | ns | (1,13)
T_PL | PCLK2:1 Low Time | (T/2) - 2 | T/2 | ns | (1,13)
T_PR | PCLK2:1 Rise Time | 1 | 4 | ns | (1,3)
T_PF | PCLK2:1 Fall Time | 1 | 4 | ns | (1,3)

SYNCHRONOUS OUTPUTS (9)
T_OV, T_OH | Output Valid Delay, Output Hold | | | |
T_OV1, T_OH1 | A31:2 | 3 | 14 | ns |
T_OV2, T_OH2 | BE3:0 | 3 | 16 | ns |
T_OV3, T_OH3 | ADS | 6 | 18 | ns |
T_OV4, T_OH4 | W/R | 3 | 18 | ns |
T_OV5, T_OH5 | D/C, SUP, DMA | 4 | 16 | ns |
T_OV6, T_OH6 | BLAST, WAIT | 5 | 16 | ns |
T_OV7, T_OH7 | DEN | 3 | 16 | ns |
T_OV8, T_OH8 | HOLDA, BREQ | 4 | 16 | ns |
T_OV9, T_OH9 | LOCK | 4 | 16 | ns |
T_OV10, T_OH10 | DACK3:0, EOP3:0/TC3:0 | 3 | 18 | ns |
T_OV11, T_OH11 | D31:0 | 3 | 16 | ns | (6,11)
T_OV12, T_OH12 | DT/R | T/2 + 3 | T/2 + 14 | ns | (6,11)
T_OV13, T_OH13 | FAIL | 2 | 14 | ns |
T_OF | Output Float for all outputs | 3 | 22 | ns | (6)

SYNCHRONOUS INPUTS (10)
T_IS | Input Setup | | | |
T_IS1 | D31:0 | 3 | | ns | (1,11)
T_IS2 | BOFF | 17 | | ns | (1,11)
T_IS3 | BTERM/READY | 7 | | ns | (1,11)
T_IS4 | HOLD | 7 | | ns | (1,11)
T_IH | Input Hold | | | |
T_IH1 | D31:0 | 5 | | ns | (1,11)
T_IH2 | BOFF | 5 | | ns | (1,11)
T_IH3 | BTERM/READY | 2 | | ns | (1,11)
T_IH4 | HOLD | 3 | | ns | (1,11)
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AC Characteristics — 80960CA-33 
(80960CA-33 only, under the conditions described in Section 4.2, Operating Conditions and Section 4.5.1, AC Test Conditions.) (Continued)

Symbol | Parameter | Min | Max | Units | Notes

RELATIVE OUTPUT TIMINGS (9,7)
T_AVSH1 | A31:2 Valid to ADS Rising | T - 4 | T + 4 | ns |
T_AVSH2 | BE3:0, W/R, SUP, D/C, DMA, DACK3:0 Valid to ADS Rising | T - 6 | T + 6 | ns |
T_AVEL1 | A31:2 Valid to DEN Falling | T - 4 | T + 4 | ns |
T_AVEL2 | BE3:0, W/R, SUP, INST, DMA, DACK3:0 Valid to DEN Falling | T - 6 | T + 6 | ns |
T_NLQV | WAIT Falling to Output Data Valid | -4 | +4 | ns |
T_DVNH | Output Data Valid to WAIT Rising | N*T - 4 | N*T + 4 | ns | (4)
T_NLNH | WAIT Falling to WAIT Rising | N*T - 4 | N*T + 4 | ns | (4)
T_NHQX | Output Data Hold after WAIT Rising | (N + 1)*T - 4 | (N + 1)*T + 4 | ns | (5)
T_EHTV | DT/R Hold after DEN High | T/2 - 4 | ∞ | ns | (6)
T_TVEL | DT/R Valid to DEN Falling | T/2 - 4 | T/2 + 4 | ns | (7)

RELATIVE INPUT TIMINGS (7)
T_IS5 | RESET Input Setup | 6 | | ns | (15)
T_IH5 | RESET Input Hold | 5 | | ns | (15)
T_IS6 | DREQ3:0 Input Setup | 12 | | ns | (8)
T_IH6 | DREQ3:0 Input Hold | 7 | | ns | (8)
T_IS7 | XINT7:0, NMI Input Setup | 7 | | ns | (15)
T_IH7 | XINT7:0, NMI Input Hold | 3 | | ns | (15)




NOTES: 

(1) See Section 4.5.2, AC Timing Waveforms for waveforms and definitions. 

(2) See Figure 22 for capacitive derating information for output delays and hold times. 

(3) See Figure 23 for capacitive derating information for rise and fall times. 

(4) Where N is the number of N_RAD, N_RDD, N_WAD, or N_WDD wait states that are programmed in the Bus Controller Region Table. When there are no wait states in an access, WAIT never goes active.

(5) N = Number of wait states inserted with READY.

(6) Output Data and/or DT/R may be driven indefinitely following a cycle if there is no subsequent bus activity.

(7) See Notes 1, 2 and 3.

(8) Since asynchronous inputs are synchronized internally by the 80960CA they have no required setup or hold times in 
order to be recognized and for proper operation. However, in order to guarantee recognition of the input at a particular rising 
edge of PCLK2:1 the setup times shown must be met. Asynchronous inputs must be active for at least two consecutive 
PCLK2:1 rising edges to be seen by the processor. 

(9) These specifications are guaranteed by the processor. 

(10) These specifications must be met by the system for proper operation of the processor. 

(11) This timing is dependent upon the loading of PCLK2:1. Use the derating curves of Section 4.5.3 to adjust the timing for 
PCLK2:1 loading. 

(12) In the One-x input clock mode the maximum input clock period is limited to 125 ns while the processor is operating. 
When the processor is in reset, the input clock may stop even in One-x mode. 

(13) When in the One-x input clock mode, these specifications assume a stable input clock with a period variation of less 
than ±0.1% between adjacent cycles. 

(14) This parameter is not tested. 

(15) Since asynchronous inputs are synchronized internally by the 80960CA, they have no required setup or hold times in 
order to be recognized and for proper operation. However, in order to guarantee recognition of the input at a particular 
falling edge of PCLK2:1 the setup times shown must be met. Asynchronous inputs must be active for at least two consecu- 
tive PCLK2:1 falling edges to be seen by the processor. 
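
The period limits in the input clock rows follow directly from the frequency limits, and the PCLK2:1 period T follows from the clock mode. A short sketch with illustrative function names:

```python
def clkin_period_ns(clkin_mhz: float) -> float:
    """T_C in ns for a CLKIN frequency given in MHz."""
    return 1000.0 / clkin_mhz


def pclk_period_ns(clkin_mhz: float, one_x_mode: bool) -> float:
    """T = T_C in One-X mode, T = 2 * T_C in Two-X mode."""
    t_c = clkin_period_ns(clkin_mhz)
    return t_c if one_x_mode else 2.0 * t_c


if __name__ == "__main__":
    print(round(clkin_period_ns(33.0), 2))  # One-X minimum period, 80960CA-33
    print(round(clkin_period_ns(66.0), 2))  # Two-X minimum period, 80960CA-33
    print(clkin_period_ns(8.0))             # One-X maximum period (note 12)
```

The 125 ns maximum period in One-X mode is the same constraint as the 8 MHz minimum CLKIN frequency in Section 4.2.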
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AC Characteristics — 80960CA-25 
(80960CA-25 only, under the conditions described in Section 4.2, Operating Conditions and Section 4.5.1, 
AC Test Conditions.) 



Symbol | Parameter | Min | Max | Units | Notes

INPUT CLOCK (10)
T_F | CLKIN Frequency | | 50 | MHz | (1)
T_C | CLKIN Period, In One-X Mode (f_CLK1x) | 40 | 125 | ns | (1,12)
 | CLKIN Period, In Two-X Mode (f_CLK2x) | 20 | ∞ | ns | (1)
T_CS | CLKIN Period Stability, In One-X Mode (f_CLK1x) | | ±0.1% | Δ | (1,13)
T_CH | CLKIN High Time, In One-X Mode (f_CLK1x) | 8 | 62.5 | ns | (1,12)
 | CLKIN High Time, In Two-X Mode (f_CLK2x) | 8 | ∞ | ns | (1)
T_CL | CLKIN Low Time, In One-X Mode (f_CLK1x) | 8 | 62.5 | ns | (1,12)
 | CLKIN Low Time, In Two-X Mode (f_CLK2x) | 8 | ∞ | ns | (1)
T_CR | CLKIN Rise Time | | 6 | ns | (1)
T_CF | CLKIN Fall Time | | 6 | ns | (1)

OUTPUT CLOCKS (9)
T_CP | CLKIN to PCLK2:1 Delay, In One-X Mode (f_CLK1x) | -2 | 2 | ns | (1,3,13,14)
 | CLKIN to PCLK2:1 Delay, In Two-X Mode (f_CLK2x) | 2 | 25 | ns | (1,3)
T | PCLK2:1 Period, In One-X Mode (f_CLK1x) | T_C | T_C | ns | (1,13)
 | PCLK2:1 Period, In Two-X Mode (f_CLK2x) | 2T_C | 2T_C | ns | (1,3)
T_PH | PCLK2:1 High Time | (T/2) - 3 | T/2 | ns | (1,13)
T_PL | PCLK2:1 Low Time | (T/2) - 3 | T/2 | ns | (1,13)
T_PR | PCLK2:1 Rise Time | 1 | 4 | ns | (1,3)
T_PF | PCLK2:1 Fall Time | 1 | 4 | ns | (1,3)

SYNCHRONOUS OUTPUTS (9)
T_OV, T_OH | Output Valid Delay, Output Hold | | | |
T_OV1, T_OH1 | A31:2 | 3 | 16 | ns |
T_OV2, T_OH2 | BE3:0 | 3 | 18 | ns |
T_OV3, T_OH3 | ADS | 6 | 20 | ns |
T_OV4, T_OH4 | W/R | 3 | 20 | ns |
T_OV5, T_OH5 | D/C, SUP, DMA | 4 | 18 | ns |
T_OV6, T_OH6 | BLAST, WAIT | 5 | 18 | ns |
T_OV7, T_OH7 | DEN | 3 | 18 | ns |
T_OV8, T_OH8 | HOLDA, BREQ | 4 | 18 | ns |
T_OV9, T_OH9 | LOCK | 4 | 18 | ns |
T_OV10, T_OH10 | DACK3:0, EOP3:0/TC3:0 | 4 | 20 | ns |
T_OV11, T_OH11 | D31:0 | 3 | 18 | ns | (6,11)
T_OV12, T_OH12 | DT/R | T/2 + 3 | T/2 + 16 | ns | (6,11)
T_OV13, T_OH13 | FAIL | 2 | 16 | ns |
T_OF | Output Float for all outputs | 3 | 22 | ns | (6)

SYNCHRONOUS INPUTS (10)
T_IS | Input Setup | | | |
T_IS1 | D31:0 | 5 | | ns | (1,11)
T_IS2 | BOFF | 19 | | ns | (1,11)
T_IS3 | BTERM/READY | 9 | | ns | (1,11)
T_IS4 | HOLD | 9 | | ns | (1,11)
T_IH | Input Hold | | | |
T_IH1 | D31:0 | 5 | | ns | (1,11)
T_IH2 | BOFF | 7 | | ns | (1,11)
T_IH3 | BTERM/READY | 2 | | ns | (1,11)
T_IH4 | HOLD | 5 | | ns | (1,11)
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AC Characteristics — 80960CA-25 
(80960CA-25 only, under the conditions described in Section 4.2, Operating Conditions and Section 4.5.1, 
AC Test Conditions.) (Continued) 



Symbol | Parameter | Min | Max | Units | Notes

RELATIVE OUTPUT TIMINGS (9,7)
T_AVSH1 | A31:2 Valid to ADS Rising | T - 4 | T + 4 | ns |
T_AVSH2 | BE3:0, W/R, SUP, D/C, DMA, DACK3:0 Valid to ADS Rising | T - 6 | T + 6 | ns |
T_AVEL1 | A31:2 Valid to DEN Falling | T - 4 | T + 4 | ns |
T_AVEL2 | BE3:0, W/R, SUP, INST, DMA, DACK3:0 Valid to DEN Falling | T - 6 | T + 6 | ns |
T_NLQV | WAIT Falling to Output Data Valid | -4 | +4 | ns |
T_DVNH | Output Data Valid to WAIT Rising | N*T - 4 | N*T + 4 | ns | (4)
T_NLNH | WAIT Falling to WAIT Rising | N*T - 4 | N*T + 4 | ns | (4)
T_NHQX | Output Data Hold after WAIT Rising | (N + 1)*T - 4 | (N + 1)*T + 4 | ns | (5)
T_EHTV | DT/R Hold after DEN High | T/2 - 4 | ∞ | ns | (6)
T_TVEL | DT/R Valid to DEN Falling | T/2 - 4 | T/2 + 4 | ns | (7)

RELATIVE INPUT TIMINGS (7)
T_IS5 | RESET Input Setup | 8 | | ns | (15)
T_IH5 | RESET Input Hold | 7 | | ns | (15)
T_IS6 | DREQ3:0 Input Setup | 14 | | ns | (8)
T_IH6 | DREQ3:0 Input Hold | 9 | | ns | (8)
T_IS7 | XINT7:0, NMI Input Setup | 9 | | ns | (15)
T_IH7 | XINT7:0, NMI Input Hold | 5 | | ns | (15)




NOTES: 

(1) See Section 4.5.2, AC Timing Waveforms for waveforms and definitions. 

(2) See Figure 22 for capacitive derating information for output delays and hold times. 

(3) See Figure 23 for capacitive derating information for rise and fall times. 

(4) Where N is the number of N_RAD, N_RDD, N_WAD, or N_WDD wait states that are programmed in the Bus Controller Region Table. When there are no wait states in an access, WAIT never goes active.

(5) N = Number of wait states inserted with READY.

(6) Output Data and/or DT/R may be driven indefinitely following a cycle if there is no subsequent bus activity.

(7) See Notes 1, 2 and 3.

(8) Since asynchronous inputs are synchronized internally by the 80960CA they have no required setup or hold times in 
order to be recognized and for proper operation. However, in order to guarantee recognition of the input at a particular rising 
edge of PCLK2:1 the setup times shown must be met. Asynchronous inputs must be active for at least two consecutive 
PCLK2:1 rising edges to be seen by the processor. 

(9) These specifications are guaranteed by the processor. 

(10) These specifications must be met by the system for proper operation of the processor.

(11) This timing is dependent upon the loading of PCLK2:1. Use the derating curves of Section 4.5.3 to adjust the timing for PCLK2:1 loading.

(12) In the One-x input clock mode the maximum input clock period is limited to 125 ns while the processor is operating. 
When the processor is in reset, the input clock may stop even in One-x mode. 

(13) When in the One-x input clock mode, these specifications assume a stable input clock with a period variation of less 
than ±0.1% between adjacent cycles. 

(14) This parameter is not tested. 

(15) Since asynchronous inputs are synchronized internally by the 80960CA, they have no required setup or hold times in 
order to be recognized and for proper operation. However, in order to guarantee recognition of the input at a particular 
falling edge of PCLK2:1 the setup times shown must be met. Asynchronous inputs must be active for at least two consecu- 
tive PCLK2:1 falling edges to be seen by the processor. 
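
The relative output timings that depend on wait states are expressed as multiples of the PCLK2:1 period T with a ±4 ns tolerance. The sketch below evaluates two of them for a given N and T; the function names are illustrative, and N carries the meaning given in notes (4) and (5).

```python
TOLERANCE_NS = 4.0  # the +/- 4 ns allowance shown in the relative timing rows


def t_dvnh_window_ns(n_wait: int, t_ns: float) -> tuple:
    """(min, max) for Output Data Valid to WAIT Rising: N*T -/+ 4."""
    nominal = n_wait * t_ns
    return (nominal - TOLERANCE_NS, nominal + TOLERANCE_NS)


def t_nhqx_window_ns(n_ready_wait: int, t_ns: float) -> tuple:
    """(min, max) for Output Data Hold after WAIT Rising: (N+1)*T -/+ 4."""
    nominal = (n_ready_wait + 1) * t_ns
    return (nominal - TOLERANCE_NS, nominal + TOLERANCE_NS)


if __name__ == "__main__":
    t = 40.0  # PCLK2:1 period in ns for a 25 MHz 80960CA-25
    print(t_dvnh_window_ns(2, t))
    print(t_nhqx_window_ns(1, t))
```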
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AC Characteristics — 80960CA-1 6 

(80960CA-16 only, under the conditions described in Section 4.2, Operating Conditions and Section 4.5.1, 
AC Test Conditions.) (Continued) 



Symbol | Parameter | Min | Max | Units | Notes

INPUT CLOCK (10)
T_F | CLKIN Frequency | | 32 | MHz | (1)
T_C | CLKIN Period, In One-X Mode (f_CLK1x) | 62.5 | 125 | ns | (1,12)
 | CLKIN Period, In Two-X Mode (f_CLK2x) | 31.25 | ∞ | ns | (1)
T_CS | CLKIN Period Stability, In One-X Mode (f_CLK1x) | | ±0.1% | Δ | (1,13)
T_CH | CLKIN High Time, In One-X Mode (f_CLK1x) | 10 | 62.5 | ns | (1,12)
 | CLKIN High Time, In Two-X Mode (f_CLK2x) | 10 | ∞ | ns | (1)
T_CL | CLKIN Low Time, In One-X Mode (f_CLK1x) | 10 | 62.5 | ns | (1,12)
 | CLKIN Low Time, In Two-X Mode (f_CLK2x) | 10 | ∞ | ns | (1)
T_CR | CLKIN Rise Time | | 6 | ns | (1)
T_CF | CLKIN Fall Time | | 6 | ns | (1)

OUTPUT CLOCKS (9)
T_CP | CLKIN to PCLK2:1 Delay, In One-X Mode (f_CLK1x) | -2 | 2 | ns | (1,3,13,14)
 | CLKIN to PCLK2:1 Delay, In Two-X Mode (f_CLK2x) | 2 | 25 | ns | (1,3)
T | PCLK2:1 Period, In One-X Mode (f_CLK1x) | T_C | T_C | ns | (1,13)
 | PCLK2:1 Period, In Two-X Mode (f_CLK2x) | 2T_C | 2T_C | ns | (1,3)
T_PH | PCLK2:1 High Time | (T/2) - 4 | T/2 | ns | (1,13)
T_PL | PCLK2:1 Low Time | (T/2) - 4 | T/2 | ns | (1,13)
T_PR | PCLK2:1 Rise Time | 1 | 4 | ns | (1,3)
T_PF | PCLK2:1 Fall Time | 1 | 4 | ns | (1,3)

SYNCHRONOUS OUTPUTS (9)
T_OV, T_OH | Output Valid Delay, Output Hold | | | |
T_OV1, T_OH1 | A31:2 | 3 | 18 | ns |
T_OV2, T_OH2 | BE3:0 | 3 | 20 | ns |
T_OV3, T_OH3 | ADS | 6 | 22 | ns |
T_OV4, T_OH4 | W/R | 3 | 22 | ns |
T_OV5, T_OH5 | D/C, SUP, DMA | 4 | 20 | ns |
T_OV6, T_OH6 | BLAST, WAIT | 5 | 20 | ns |
T_OV7, T_OH7 | DEN | 3 | 20 | ns |
T_OV8, T_OH8 | HOLDA, BREQ | 4 | 20 | ns |
T_OV9, T_OH9 | LOCK | 4 | 20 | ns |
T_OV10, T_OH10 | DACK3:0, EOP3:0/TC3:0 | 4 | 22 | ns |
T_OV11, T_OH11 | D31:0 | 3 | 20 | ns | (6,11)
T_OV12, T_OH12 | DT/R | T/2 + 3 | T/2 + 18 | ns | (6,11)
T_OV13, T_OH13 | FAIL | 2 | 18 | ns |
T_OF | Output Float for all outputs | 3 | 22 | ns | (6)

SYNCHRONOUS INPUTS (10)
T_IS | Input Setup | | | |
T_IS1 | D31:0 | 5 | | ns | (1,11)
T_IS2 | BOFF | 21 | | ns | (1,11)
T_IS3 | BTERM/READY | 9 | | ns | (1,11)
T_IS4 | HOLD | 9 | | ns | (1,11)
T_IH | Input Hold | | | |
T_IH1 | D31:0 | 5 | | ns | (1,11)
T_IH2 | BOFF | 7 | | ns | (1,11)
T_IH3 | BTERM/READY | 2 | | ns | (1,11)
T_IH4 | HOLD | 5 | | ns | (1,11)
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AC Characteristics — 80960C A- 16 
(80960CA-16 only, under the conditions described in Section 4.2, Operating Conditions and Section 4.5.1, 
AC Test Conditions.) (Continued) 



Symbol | Parameter | Min | Max | Units | Notes

RELATIVE OUTPUT TIMINGS (9,7)
T_AVSH1 | A31:2 Valid to ADS Rising | T - 4 | T + 4 | ns |
T_AVSH2 | BE3:0, W/R, SUP, D/C, DMA, DACK3:0 Valid to ADS Rising | T - 6 | T + 6 | ns |
T_AVEL1 | A31:2 Valid to DEN Falling | T - 6 | T + 6 | ns |
T_AVEL2 | BE3:0, W/R, SUP, INST, DMA, DACK3:0 Valid to DEN Falling | T - 6 | T + 6 | ns |
T_NLQV | WAIT Falling to Output Data Valid | -4 | +4 | ns |
T_DVNH | Output Data Valid to WAIT Rising | N*T - 4 | N*T + 4 | ns | (4)
T_NLNH | WAIT Falling to WAIT Rising | N*T - 4 | N*T + 4 | ns | (4)
T_NHQX | Output Data Hold after WAIT Rising | (N + 1)*T - 4 | (N + 1)*T + 4 | ns | (5)
T_EHTV | DT/R Hold after DEN High | T/2 - 4 | ∞ | ns | (6)
T_TVEL | DT/R Valid to DEN Falling | T/2 - 4 | T/2 + 4 | ns | (7)

RELATIVE INPUT TIMINGS (7)
T_IS5 | RESET Input Setup | 10 | | ns | (15)
T_IH5 | RESET Input Hold | 9 | | ns | (15)
T_IS6 | DREQ3:0 Input Setup | 16 | | ns | (8)
T_IH6 | DREQ3:0 Input Hold | 11 | | ns | (8)
T_IS7 | XINT7:0, NMI Input Setup | 9 | | ns | (15)
T_IH7 | XINT7:0, NMI Input Hold | 5 | | ns | (15)




NOTES: 

(1) See Section 4.5.2, AC Timing Waveforms for waveforms and definitions. 

(2) See Figure 22 for capacitive derating information for output delays and hold times. 

(3) See Figure 23 for capacitive derating information for rise and fall times. 

(4) Where N is the number of N_RAD, N_RDD, N_WAD, or N_WDD wait states that are programmed in the Bus Controller Region Table. When there are no wait states in an access, WAIT never goes active.

(5) N = Number of wait states inserted with READY.

(6) Output Data and/or DT/R may be driven indefinitely following a cycle if there is no subsequent bus activity. 

(7) See Notes 1, 2 and 3. 

(8) Since asynchronous inputs are synchronized internally by the 80960CA they have no required setup or hold times in 
order to be recognized and for proper operation. However, in order to guarantee recognition of the input at a particular rising 
edge of PCLK2:1 the setup times shown must be met. Asynchronous inputs must be active for at least two consecutive 
PCLK2:1 rising edges to be seen by the processor. 

(9) These specifications are guaranteed by the processor. 

(10) These specifications must be met by the system for proper operation of the processor. 

(11) This timing is dependent upon the loading of PCLK2:1. Use the derating curves of Figure 22 to adjust the timing for PCLK2:1 loading.

(12) In the One-x input clock mode the maximum input clock period is limited to 125 ns while the processor is operating. When the processor is in reset, the input clock may stop even in One-x mode.

(13) When in the One-x input clock mode, these specifications assume a stable input clock with a period variation of less than ±0.1% between adjacent cycles.

(14) This parameter is not tested.

(15) Since asynchronous inputs are synchronized internally by the 80960CA, they have no required setup or hold times in order to be recognized and for proper operation. However, in order to guarantee recognition of the input at a particular falling edge of PCLK2:1 the setup times shown must be met. Asynchronous inputs must be active for at least two consecutive PCLK2:1 falling edges to be seen by the processor.
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4.5.1. AC TEST CONDITIONS 



[Drawing, reference 270727-11: output pin driving load capacitance C_L; C_L = 50 pF for all signals.]




The AC Specifications in Section 4.5 are tested with the 50 pF load shown in Figure 15. See Figure 22 to see how timings vary with load capacitance.

Specifications are measured at the 1.5V crossing point, unless otherwise indicated. Input waveforms are assumed to have a rise-and-fall time of ≤ 2 ns from 0.8V to 2.0V. See Section 4.5.2, AC Timing Waveforms, for AC spec definitions, test points, and illustrations.



Figure 15. AC Test Load 
4.5.2. AC TIMING WAVEFORMS 
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Figure 16a. Input and Output Clocks Waveform 
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Figure 16b. CLKIN Waveform 
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Figure 17. Output Delay and Float Waveform 
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Figure 18a. Input Setup and Hold Waveform 

(T_OV) (T_OH) OUTPUT DELAY: The maximum output delay is referred to as the Output Valid Delay (T_OV). The minimum output delay is referred to as the Output Hold (T_OH).

(T_OF) OUTPUT FLOAT DELAY: The output float condition occurs when the maximum output current becomes less than I_LO in magnitude.

(T_IS) (T_IH) INPUT SETUP AND HOLD: The input setup and hold requirements specify the sampling window during which synchronous inputs must be stable for correct processor operation.
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RESET, NMI.XINT 7:0 ^.5V^( VALID j^J^V^^ 
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Figure 18b. RESET, NMI, XINT7:0 Input Setup and Hold Waveform 
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(Tov) (Toh) — OUTPUT DELAY — The maximum output delay is referred to 
as the Output Valid Delay (T_OV). The minimum output delay is referred to as the Output Hold (T_OH).

(T_OF) OUTPUT FLOAT DELAY: The output float condition occurs when the maximum output current becomes less than I_LO in magnitude.

(T_IS) (T_IH) INPUT SETUP AND HOLD: The input setup and hold requirements specify the sampling window during which synchronous inputs must be stable for correct processor operation.



Figure 19. Hold Acknowledge Timings 
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Figure 20. Bus Back-Off (BOFF) Timings 
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i \__i V__i \__i 



A31:2, BE3:0, W/R, LOCK, J 
SUP, D/C, DMA 
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Figure 21. Relative-timings Waveforms 
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NOTE: 










PCLK Load = 50 pF 











Figure 22. Output Delay or Hold vs Load Capacitance 
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Figure 23. Rise and Fall Time Derating at Highest Operating Temperature and Minimum Vqc 




[Plot residue: supply current I_CC versus frequency (MHz); I_CC under test conditions.]

Figure 24. I_CC vs Frequency and Temperature
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5.0 RESET, BACKOFF AND HOLD 
ACKNOWLEDGE 

The following table lists the condition of each processor output pin while RESET is asserted (low).



Table 13. Reset Conditions 


Pins | State During Reset (HOLDA inactive)(1)
A31:A2 | Floating
D31:D0 | Floating
BE3:0 | Driven high (Inactive)
W/R | Driven low (Read)
ADS | Driven high (Inactive)
WAIT | Driven high (Inactive)
BLAST | Driven low (Active)
DT/R | Driven low (Receive)
DEN | Driven high (Inactive)
LOCK | Driven high (Inactive)
BREQ | Driven low (Inactive)
D/C | Floating
DMA | Floating
SUP | Floating
FAIL | Driven low (Active)
DACK3 | Driven high (Inactive)
DACK2 | Driven high (Inactive)
DACK1 | Driven high (Inactive)
DACK0 | Driven high (Inactive)
EOP/TC3 | Floating (set to input mode)
EOP/TC2 | Floating (set to input mode)
EOP/TC1 | Floating (set to input mode)
EOP/TC0 | Floating (set to input mode)



NOTE: 

(1) With regard to bus output pin state only, the Hold 
Acknowledge state takes precedence over the reset state. 
Although asserting the RESET pin will internally reset the 
processor, the processor's bus output pins will not enter 
the reset state if it has granted Hold Acknowledge to a 
previous HOLD request (HOLDA is active). Furthermore, the 
processor will grant new HOLD requests and enter the 
Hold Acknowledge state even while in reset. 
For example, if HOLDA is not active and the processor is 
in the reset state, and HOLD is then asserted, the processor's 
bus pins will enter the Hold Acknowledge state and 
HOLDA will be granted. The processor will not be able to 
perform memory accesses until the HOLD request is 
removed, even if the RESET pin is brought high. This 
operation is provided to simplify boot-up synchronization among 
multiple processors sharing the same bus. 
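
The precedence rule in note (1) can be sketched as a small lookup. This is an illustrative Python model under stated assumptions, not Intel code: the names `RESET_STATE` and `bus_pin_state` are hypothetical, only a few bus output pins from Table 13 are included, and "normal" simply stands for ordinary bus operation.

```python
# Illustrative model of note (1): for bus output pins, Hold Acknowledge
# takes precedence over reset. All names here are hypothetical.

# A few bus output pins and their reset states, copied from Table 13.
RESET_STATE = {
    "A31:A2": "floating",
    "BE3:0": "high",
    "W/R": "low",
    "ADS": "high",
    "BLAST": "low",
}


def bus_pin_state(pin, reset_asserted, holda_active):
    """Return the modeled state of one bus output pin.

    "floating" whenever HOLDA is active (Table 14), the Table 13 value
    while only RESET is asserted, and "normal" otherwise.
    """
    if pin not in RESET_STATE:
        raise KeyError(f"pin not modeled: {pin}")
    if holda_active:
        # Hold Acknowledge wins, even if RESET is asserted at the same time.
        return "floating"
    if reset_asserted:
        return RESET_STATE[pin]
    return "normal"
```

For instance, `bus_pin_state("ADS", True, True)` gives `"floating"` in this model, because the pin stays in the Hold Acknowledge state even though RESET is asserted.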



The following table lists the condition of each processor 
output pin while HOLDA is asserted (low). 

Table 14. Hold Acknowledge and Backoff Conditions 

Pins        State During HOLDA 
--------    ---------------------- 
A31:A2      Floating 
D31:D0      Floating 
BE3:0       Floating 
W/R         Floating 
ADS         Floating 
WAIT        Floating 
BLAST       Floating 
DT/R        Floating 
DEN         Floating 
LOCK        Floating 
BREQ        Driven (high or low) 
D/C         Floating 
DMA         Floating 
SUP         Floating 
FAIL        Driven high (Inactive) 
DACK3       Driven high (Inactive) 
DACK2       Driven high (Inactive) 
DACK1       Driven high (Inactive) 
DACK0       Driven high (Inactive) 
EOP/TC3     Driven if output 
EOP/TC2     Driven if output 
EOP/TC1     Driven if output 
EOP/TC0     Driven if output 
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RESET 1.5V 



PCLK2:1 
(Case 1) 



PCLK2:1 
(Case 2) 




NOTE: 

Case 1 and Case 2 show two possible polarities of PCLK2:1. 
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Figure 28. Clock Synchronization in the 2x Clock Mode 
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Region Table Entry 
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Figure 29. Non-Burst, Non-Pipelined Accesses without wait states 
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Region Table Entry 



Region table entry bit fields: bits 31-23, bit 22, bit 21, bits 20-19, bits 18-17, bits 16-12, bits 11-10, bits 9-8, bits 7-3, bit 2, bit 1, bit 0 

Waveform signals: A31:2, BE3:0; DEN; SUP, DMA, D/C, LOCK 




Figure 30. Non-Burst, Non-Pipelined Read with wait states 
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Region Table Entry 
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Figure 31. Non-Burst, Non-Pipelined Write with wait states 
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Region Table Entry 
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Figure 32. Burst, Non-Pipelined Read without wait states, 32-bit bus 
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Region Table Entry 
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Figure 33. Burst, Non-Pipelined Read with wait states, 32-bit bus 
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Region Table Entry 
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Figure 34. Burst, Non-Pipelined Write without wait states, 32-bit bus 
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Region Table Entry 
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Figure 35. Burst, Non-Pipelined Write with wait states, 32-bit bus 
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Region Table Entry 
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Figure 36. Burst, Non-Pipelined Read with wait states, 16-bit bus 
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Region Table Entry 
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Figure 37. Burst, Non-Pipelined Read with wait states, 8-bit bus 
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Region Table Entry 
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Figure 38. Non-Burst, Pipelined Read without wait states, 32-bit bus 
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Region Table Entry 
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Figure 39. Non-Burst, Pipelined Read with wait states, 32-bit bus 
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Region Table Entry 
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Figure 40. Burst, Pipelined Read without wait states, 32-bit bus 
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Figure 41. Burst, Pipelined Read with wait states, 32-bit bus 
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Figure 42. Burst, Pipelined Read with wait states, 16-bit bus 
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Figure 43. Burst, Pipelined Read with wait states, 8-bit bus 






80960CA-33, -25, -16 





Quad-Word Read: Nrad = 0, Nrdd = 0, Nxda = , Ready Enabled 
Quad-Word Write: Nwad = 1, Nwdd = 0, Nxda = , Ready Enabled 

Waveform signals: PCLK; ADS; A31:4, SUP, DMA, INST, D/C, BE3:0, LOCK; W/R; BLAST; DT/R; DEN; READY; BTERM; A3:A2; D31:0 


Figure 44. Using External READY 
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NOTE: 

READY adds memory access time to data transfers, whether or not the bus access is a burst access. BTERM interrupts 
a bus access, whether or not the bus access has more data transfers pending. Either the READY signal or the BTERM 
signal will terminate a bus access if the signal is asserted during the last (or only) data transfer of the bus access. 



Figure 45. Terminating a Burst with BTERM 
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Figure 46. BOFF Functional Timing 
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Figure 47. HOLD Functional Timing 
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NOTES: ___ 

1. Case 1: DREQ must deassert before DACK deasserts. Applications are Fly-by and some packing and unpacking 
modes, adjacent load-stores or store-loads, loads followed by loads, and stores followed by stores. 

2. Case 2: DREQ must be deasserted by the second clock (rising edge) after DACK is driven high. Applications are non 
fly-by transfers and adjacent load-stores or store-loads. 

3. DACKx is asserted for the duration of a DMA bus request. The request may consist of multiple bus accesses (defined 
by ADS and BLAST). Refer to User's Manual for "access", "request" definition. 



Figure 48. DREQ and DACK Functional Timing 




NOTE: 

EOP has the same AC Timing Requirements as DREQ to prevent unwanted DMA requests. 

EOP is NOT edge triggered. EOP must be held for a minimum of 2 clock cycles then EOP must be deasserted 
within 15 clock cycles. 

Figure 49. EOP Functional Timing 
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NOTE: 

Terminal Count becomes active during the last bus request of a buffer transfer. If the last LOAD/STORE bus request is 
executed as multiple bus accesses, the TC will be active for the entire bus request. Refer to the User's Manual for 
further information. 



Figure 50. Terminal Count Functional Timing 
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Figure 51. FAIL Functional Timing 
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Figure 52. A Summary of Aligned and Unaligned Transfers for Little Endian Regions 
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Figure 53. A Summary of Aligned and Unaligned Transfers for Little Endian Regions (Continued) 
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i960™ MC PROCESSOR 
PRODUCT OVERVIEW 



This chapter provides an overview of the architecture 
of the i960 MC processor. 

The i960 MC processor is the military-grade member of 
a new family of processors from Intel. This processor 
family is based on a new 32-bit architecture called the 
i960 architecture. The i960 architecture has been de- 
signed specifically to meet the needs of embedded appli- 
cations such as avionics, aerospace, weapons systems, 
robotics and instrumentation, where high reliability is 
critical. It represents a renewed commitment from Intel 
to provide reliable, high-performance processors and 
controllers for the embedded processor marketplace. 

The i960 architecture can best be characterized as a 
high-performance computing engine. It features 
high-speed instruction execution and ease of 
programming. It is also easily extensible, allowing processors 
and controllers based on this architecture to be 
conveniently customized to meet the needs of specific 
processing and control applications. 

Some of the important attributes of the i960 architec- 
ture include: 

° full 32-bit registers 

° high-speed, pipelined instruction execution 

° a convenient program execution environment with 
32 general-purpose registers and a versatile set of 
special-function registers 

° a highly optimized procedure call mechanism that 
features on-chip caching of local variables and 
parameters 

° extensive facilities for handling interrupts and faults 

° extensive tracing facilities to support efficient pro- 
gram debugging and monitoring 

° register scoreboarding and write buffering to permit 
efficient operation with lower performance memory 
subsystems 

The i960 MC processor implements the i960 architec- 
ture, plus it offers several extensions to the architecture. 
Some of these extensions, such as on-chip support for 
floating-point arithmetic, virtual memory management 
and multitasking, are designed to enhance overall sys- 
tem performance. Several other extensions are designed 
to enhance system reliability and robustness. These ex- 
tensions include facilities for hardware enforced protec- 
tion of software modules and for creating fault tolerant 
systems through the use of redundant processors. 



The following sections describe those features of the 
i960 architecture that are provided to streamline code 
execution and simplify programming. The extensions to 
this architecture provided in the i960 MC processor are 
described at the end of the chapter. 



HIGH PERFORMANCE PROGRAM 
EXECUTION 

Much of the design of the i960 architecture has been 
aimed at maximizing the processor's computational 
and data processing speed through increased parallel- 
ism. The following paragraphs describe several of the 
mechanisms and techniques used to accomplish this 
goal, including: 

° an efficient load and store memory-access model 

° caching of code and procedural data 

° overlapped execution of instructions 

° many one or two clock-cycle instructions 



Load and Store Model 

One of the more important features of the i960 archi- 
tecture is that most of its operations are performed on 
operands in registers, rather than in memory. For ex- 
ample, all the arithmetic, logical, comparison, branch- 
ing and bit operations are performed with registers and 
literals. 

This feature provides two benefits. First, it increases 
program execution speed by minimizing the number of 
memory accesses required to execute a program. Sec- 
ond, it reduces memory latency encountered when us- 
ing slower, lower-cost memory parts. 

To support this concept, the architecture provides a 
generous supply of general-purpose registers. For each 
procedure, 32 registers are available (28 of which are 
available for general use). These registers are divided 
into two types: global and local. Both these types of 
registers can be used for general storage of operands. 
The only difference is that global registers retain their 
contents across procedure boundaries, whereas the 
processor allocates a new set of local registers each time 
a new procedure is called. 
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The architecture also provides a set of fast, versatile 
load and store instructions. These instructions allow 
burst transfers of 1, 2, 4, 8, 12 or 16 bytes of informa- 
tion between memory and the registers. 



On-Chip Caching of Code and Data 

To further reduce memory accesses, the architecture 
offers two mechanisms for caching code and data on 
chip: an instruction cache and multiple sets of local 
registers. The instruction cache allows prefetching of 
blocks of instructions from memory, which helps ensure 
that the instruction execution pipeline is supplied with 
a steady stream of instructions. It also reduces the 
number of memory accesses required when performing 
iterative operations such as loops. (The size of the 
instruction cache can vary. With the i960 MC processor, 
it is 512 bytes.) 

To optimize the architecture's procedure call mecha- 
nism, the processor provides multiple sets of local regis- 
ters. This allows the processor to perform most proce- 
dure calls without having to write the local registers out 
to the stack in memory. 

(The number of local-register sets provided depends on 
the processor implementation. The i960 MC processor 
provides four sets of local registers.) 
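
The effect of keeping several local-register sets on chip can be sketched with a simple counter model. This is a minimal Python illustration under stated assumptions: the class `LocalRegisterCache` and its fields are hypothetical, and the policy of writing out the oldest set when all on-chip sets are in use is an assumption made for the sketch, not a description of the processor's actual mechanism.

```python
# Simplified model of on-chip local-register sets. The class name and
# the spill/fill policy are assumptions made for illustration only.

class LocalRegisterCache:
    def __init__(self, on_chip_sets=4):
        self.on_chip_sets = on_chip_sets
        self.depth = 1      # frames in the call chain, current one included
        self.on_chip = 1    # most recent frames currently held on chip
        self.spills = 0     # register sets written out to the stack
        self.fills = 0      # register sets read back from the stack

    def call(self):
        """Model a procedure call: allocate a new local-register set."""
        self.depth += 1
        if self.on_chip == self.on_chip_sets:
            # No free set: the oldest on-chip set goes to memory.
            self.spills += 1
        else:
            self.on_chip += 1

    def ret(self):
        """Model a return: release the current local-register set."""
        if self.depth == 1:
            raise RuntimeError("return from outermost procedure")
        self.depth -= 1
        self.on_chip -= 1
        if self.on_chip == 0:
            # The caller's set was spilled earlier and must be reloaded.
            self.fills += 1
            self.on_chip = 1
```

In this model, a call chain that never nests more than three calls deep below its starting procedure causes no stack traffic for local registers with four sets; only the fourth nested call forces a spill.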



Overlapped Instruction Execution 

Another technique that the i960 architecture employs 
to enhance program execution speed is overlapping the 
execution of some instructions. This is accomplished 
through two mechanisms: register scoreboarding and 
branch prediction. 

Register scoreboarding permits instruction execution to 
continue while data is being fetched from memory. 
When a load instruction is executed, the processor sets 
one or more scoreboard bits to indicate the target 
registers to be loaded. After the target registers are loaded, 
the scoreboard bits are cleared. While the target 
registers are being loaded, the processor is allowed to 
execute other instructions that do not use these registers. 
The processor uses the scoreboard bits to ensure that 
target registers are not used until the loads are 
complete. (The checking of scoreboard bits is transparent to 
software.) The net result of using this technique is that 
code can often be optimized in such a way as to allow 
some instructions to be executed in parallel. 
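
The scoreboard behavior described above can be modeled in a few lines. This is a minimal Python sketch, assuming one scoreboard bit per register and a flat register numbering; the class `Scoreboard` and its method names are hypothetical and are not taken from Intel documentation.

```python
# Minimal register-scoreboard model: one busy bit per register.
# Class and method names are hypothetical, for illustration only.

class Scoreboard:
    def __init__(self, num_registers=32):
        self.busy = [False] * num_registers

    def issue_load(self, target_regs):
        """A load was issued: mark its target registers as pending."""
        for r in target_regs:
            self.busy[r] = True

    def complete_load(self, target_regs):
        """The data arrived: clear the scoreboard bits."""
        for r in target_regs:
            self.busy[r] = False

    def can_execute(self, regs_used):
        """An instruction may proceed only if none of its registers is pending."""
        return not any(self.busy[r] for r in regs_used)
```

While a load into registers 4 and 5 is outstanding, an instruction that touches only registers 6 and 7 can proceed in this model, whereas one that reads register 4 has to wait until `complete_load` clears the bit.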



Single-Clock Instructions 

It is the intent of the i960 architecture that a processor 
be able to execute commonly used instructions such as 
move, add, subtract, logical operations, compare and 
branch in a minimum number of clock cycles (prefer- 
ably one clock cycle). The architecture supports this 
concept in several ways. For example, the load and 
store model described earlier in this chapter (with its 
concentration on register-to-register operations) allows 
simple operations to be performed without the over- 
head of memory-to-memory operations. 

Also, all the instructions in the i960 architecture are 
32 bits or 64 bits long and aligned on 32-bit boundaries. 
This feature allows instructions to be decoded in one 
clock cycle. It also eliminates the need for an 
instruction-alignment stage in the pipeline. 

The design of the i960 MC processor takes full advan- 
tage of these features of the architecture, resulting in 
more than 50 instructions that can be executed in a 
single clock-cycle. 



Efficient Interrupt Model 

The i960 architecture provides an efficient mechanism 
for servicing interrupts from external sources. To han- 
dle interrupts, the processor maintains an interrupt ta- 
ble of 248 interrupt vectors (240 of which are available 
for general use). When an interrupt is signaled, the 
processor uses a pointer from the interrupt table to per- 
form an implicit call to an interrupt handler procedure. 
In performing this call, the processor automatically 
saves the state of the processor prior to receiving the 
interrupt; performs the interrupt routine; and then 
restores the state of the processor. A separate interrupt 
stack is also provided to segregate interrupt handling 
from application programs. 

The interrupt handling facilities also feature a method 
of prioritizing interrupts. Using this technique, the 
processor is able to store interrupts that are lower in 
priority than the task the processor is currently work- 
ing on in a pending interrupt section of the interrupt 
table. At certain defined times, the processor checks the 
pending interrupts and services them. 
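
The pending-interrupt scheme can be sketched as follows. This is an illustrative Python model under stated assumptions: the class `InterruptModel` is hypothetical, priorities are plain integers with larger meaning more urgent, and an interrupt of equal priority is treated as pending, which the text above does not specify.

```python
# Illustrative model of prioritized interrupts with a pending list.
# Names and the handling of equal priorities are assumptions.

class InterruptModel:
    def __init__(self, current_priority):
        self.current_priority = current_priority
        self.pending = []   # priorities of interrupts posted for later

    def signal(self, priority):
        """Return True if serviced at once, False if posted as pending."""
        if priority > self.current_priority:
            return True
        self.pending.append(priority)
        return False

    def check_pending(self):
        """Service pending interrupts that now outrank the current work.

        Returns the serviced priorities, highest first, and keeps the
        rest in the pending list.
        """
        ready = sorted(
            (p for p in self.pending if p > self.current_priority),
            reverse=True,
        )
        self.pending = [p for p in self.pending if p <= self.current_priority]
        return ready
```

A lower-priority interrupt is therefore not lost: it stays in `pending` until the priority of the running work drops below it and the model checks the list again.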
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SIMPLIFIED PROGRAMMING 
ENVIRONMENT 

Partly as a side benefit of its streamlined execution en- 
vironment and partly by design, processors based on 
the i960 architecture are particularly easy to program. 
For example, the large number of general-purpose reg- 
isters allows relatively complex algorithms to be execut- 
ed with a minimum number of memory accesses. The 
following paragraphs describe some of the other fea- 
tures that simplify programming. 



Highly Efficient Procedure Call 
Mechanism 

The procedure call mechanism makes procedure calls 
and parameter passing between procedures simple and 
compact. Each time a call instruction is issued, the 
processor automatically saves the current set of local 
registers and allocates a new set of local registers for 
the called procedure. Likewise, on a return from a pro- 
cedure, the current set of local registers is deallocated 
and the local registers for the procedure being returned 
to are restored. On a procedure call, the program thus 
never has to explicitly save and restore those local vari- 
ables and parameters that are stored in local registers. 



Versatile Instruction Set and 
Addressing 

The selection of instructions and addressing modes also 
simplifies programming. The architecture offers a full 
set of load, store, move, arithmetic, comparison and 
branch instructions, with operations on both integer 
and ordinal data types. It also provides a complete set 
of Boolean and bit-field instructions, to simplify opera- 
tions on bits and bit strings. 

The addressing modes are efficient and straightforward, 
while at the same time providing the necessary indexing 
and scaling modes required to address complex arrays 
and record structures. 

The large 4-gigabyte address space provides ample 
room to store programs and data. The availability of 32 
addressing lines allows some address lines to be memo- 
ry-mapped to control hardware functions. 



Extensive Fault Handling Capability 

To aid in program development, the i960 architecture 
defines a wide selection of faults that the processor 
detects, including arithmetic faults, invalid operands, 
invalid operations and machine faults. When a fault is 
detected, the processor makes an implicit call to a fault 
handler routine, using a mechanism similar to that 
described above for interrupts. The information collected 
for each fault allows program developers to quickly 
correct faulting code. It also allows automatic recovery 
from some faults. 



Debugging and Monitoring 

To support debugging systems, the i960 architecture 
provides a mechanism for monitoring processor activity 
by means of trace events. The processor can be 
configured to detect as many as seven different trace events, 
including branches, calls, supervisor calls, returns, 
prereturns, breakpoints and the execution of any 
instruction. When the processor detects a trace event, it 
signals a trace fault and calls a fault handler. Intel 
provides several tools that use this feature, including an 
in-circuit emulator (ICE™) device. 
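
The idea of enabling a subset of the seven trace events can be illustrated with a flag set. This is a hedged Python sketch: the names `TraceEvent` and `raises_trace_fault` are hypothetical, and the bit values assigned by `auto()` are arbitrary; they do not represent the processor's actual trace-control encoding.

```python
# Sketch of selectable trace events as a flag set. The bit assignments
# are arbitrary and do NOT reflect the processor's real register layout.
from enum import Flag, auto


class TraceEvent(Flag):
    INSTRUCTION = auto()      # execution of any instruction
    BRANCH = auto()
    CALL = auto()
    RETURN = auto()
    PRERETURN = auto()
    SUPERVISOR_CALL = auto()
    BREAKPOINT = auto()


def raises_trace_fault(enabled, event):
    """True if the detected event is among the enabled trace events."""
    return bool(enabled & event)
```

With `enabled = TraceEvent.CALL | TraceEvent.RETURN`, only calls and returns would lead to a trace fault in this model; a branch would pass unnoticed.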



SUPPORT FOR ARCHITECTURAL 
EXTENSIONS 

The i960 architecture described earlier in this chapter 
provides a high-performance computing engine for use 
as the computational and data-processing core of 
embedded processors or controllers. The architecture also 
provides several features that enable processors based 
on this architecture to be easily customized to meet the 
needs of specific embedded applications, such as signal 
processing, array processing or graphics processing. 

The most important of these features is a set of 32 spe- 
cial-function registers. These registers provide a conve- 
nient interface to circuitry in the processor or to pins 
that can be connected to external hardware. They can 
be used to control timers, to perform operations on spe- 
cial data types or to perform I/O functions. 

The special-function registers are similar to the global 
registers. They can be addressed by all the register-ac- 
cess instructions. 



EXTENSIONS INCLUDED IN THE 
80960MC PROCESSOR 

The extensions to the i960 architecture included in the 
i960 MC processor are built on top of the processor's 
core computing engine. These extensions are aimed at 
improving the efficiency and reliability of embedded 
systems. 
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On-Chip Floating Point 

The i960 MC processor provides a complete 
implementation of the IEEE standard for binary floating-point 
arithmetic (IEEE 754-1985). This implementation 
includes a full set of floating-point operations, including 
add, subtract, multiply, divide, trigonometric functions 
and logarithmic functions. These operations are 
performed on single precision (32-bit), double precision 
(64-bit) and extended precision (80-bit) real numbers. 

One of the benefits of this implementation is that the 
floating-point handling facilities are completely inte- 
grated into the normal instruction execution environ- 
ment. Single- and double-precision floating-point values 
are stored in the same registers as non-floating point 
values. Also, four 80-bit floating-point registers are pro- 
vided to hold extended-precision values. 



String and Decimal Operations 

The i960 MC processor provides several instructions 
for moving, filling and comparing byte strings in 
memory. These instructions speed up string operations and 
reduce the amount of code required to handle strings. 

The decimal instructions perform move, add with carry 
and subtract with carry operations on binary-coded 
decimal (BCD) strings. 



Virtual-Memory Support 

Another of the i960 MC processor's important features 
is support for virtual-memory management. In 
virtual-memory mode, the processor provides each process 
(or task) with an address space of up to 2^32 bytes. 
This address space is paged into physical memory in 
4 Kbyte pages. On-chip memory-management facilities 
handle virtual-to-physical address translation. A 
translation look-aside buffer (TLB) speeds address 
translation by storing virtual-to-physical address 
translations for frequently accessed parts of memory, 
such as the location of the page tables and the location 
of often used system data structures. 



Protection 

The i960 MC processor offers two mechanisms for 
protecting critical data structures or software modules. 
The first is the ability to use page rights bits to restrict 
access to individual pages. Page rights allow various 
levels of access to be assigned to a page, ranging from 
no access to read only to read-write. 

The second protection mechanism is a user/supervisor 
protection model. This two-level protection model 
provides hardware enforced protection of kernel 
procedures and data structures. When using this protection 
mechanism, privileged procedures and data are placed 
in protected pages of memory. These pages can then be 
accessed only through a procedure table, which 
provides a tightly controlled interface to kernel functions. 



Multitasking 

The i960 MC processor offers a variety of process 
management facilities to support concurrent execution of 
multiple tasks. These facilities can be divided into two 
groups: process scheduling and interprocess 
communications. 

The process scheduling facilities consist of a set of 
general-purpose data structures and instructions, which are 
designed to support several different multitasking 
schemes. For example, the processor provides a set of 
instructions that allow the kernel to explicitly dispatch 
a task (bind it to the processor) and to suspend a task 
(save the current state of a task so that another task can 
be bound to the processor). These instructions can be 
used within kernel procedures to schedule, dispatch 
and preempt multiple tasks. 

The processor also provides a unique feature called self 
dispatching. Here, the kernel schedules tasks by 
queuing them to a dispatch port. Thereafter, the processor 
handles the dispatching, preempting and rescheduling 
of the tasks automatically, independent of the kernel. 
When using this mechanism, tasks can be scheduled by 
priority, with up to 32 priority levels to choose from. 
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The processor's interprocess communication facilities 
include support for semaphores and communication 
ports. These facilities allow synchronization of interde- 
pendent tasks and asynchronous communication be- 
tween tasks. 



Multiprocessing 

The i960 MC processor provides several mechanisms 
designed to simplify the design of multiple-processor 
systems, allowing several processors to run in parallel, 
using shared memory resources. One of these mecha- 
nisms is the self-dispatching capability described above. 
Here, two or more processors can schedule and dis- 
patch processes from a single dispatch port, with each 
processor equally sharing the processing load. 

The processor also provides an inter-agent communica- 
tion (IAC) mechanism that allows processors to ex- 
change messages among themselves on the bus. This 
mechanism operates similarly to the interrupt mecha- 
nism, except that IAC messages are passed through 
dedicated sections of memory. The IAC mechanism 
can be used to preempt processes running on another 
processor, to manage interrupt handling or to initialize 
and synchronize several processors. 

A set of atomic instructions is also provided to 
synchronize memory accesses. Multiple processors can 
then access shared memory without introducing 
inaccuracies and ambiguities into shared data structures. 



Fault Tolerance 

The i960 family of components supports fault-tolerant 
system design through the use of the M82965 Bus Ex- 
tension Unit component. The M82965 allows two proc- 
essors to be operated in tandem to form a self-checking 
module. The two M82965s check the outputs of two 
processors (a master and a checker) cycle-by-cycle. If 
the checking M82965 detects a difference between out- 
puts, it signals an error. A software recovery procedure 
can then be initiated. 
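
The cycle-by-cycle comparison performed on a master and checker pair can be sketched as below. This is a minimal Python illustration, assuming each processor's outputs are available as a sequence with one value per clock cycle; the function name `first_mismatch` is hypothetical, and the real comparison is done in hardware by the bus extension units, not in software.

```python
# Sketch of master/checker output comparison. In the real system this
# check is done in hardware; this function is illustrative only.

def first_mismatch(master_outputs, checker_outputs):
    """Return the index of the first cycle whose outputs differ, or None."""
    for cycle, (m, c) in enumerate(zip(master_outputs, checker_outputs)):
        if m != c:
            return cycle   # an error would be signaled at this cycle
    return None
```

A return value of `None` corresponds to the two processors agreeing on every cycle that was compared; any other value marks the cycle at which recovery would start.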

This fault detection mechanism supports several fault 
detection and recovery techniques, including self heal- 
ing, and continuous-operation (non-stop) systems. 



LOOK FOR MORE IN THE FUTURE 

The i960 architecture offers exceptional performance, 
plus a wealth of useful features to help in the design of 
efficient and reliable embedded systems. But equally 
important, it offers lots of room to grow. The i960 MC 
processor provides average instruction processing rates 
of 7.5 million instructions per second (7.5 MIPS) at a 
20 MHz clock rate and 10 MIPS at a 25 MHz clock 
rate (1). 

However, the i960 MC processor is only the beginning. 
With improvements in VLSI technology, future imple- 
mentations of the i960 architecture will offer even 
greater performance. They will also offer a variety of 
useful extensions to solve specific control and monitor- 
ing needs in the field of embedded applications. 




1. 1 MIPS is equivalent to the performance of a Digital Equipment Corp. VAX 11/780. 
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80960MC 

EMBEDDED 32-BIT MICROPROCESSOR 

WITH INTEGRATED FLOATING-POINT UNIT 

AND MEMORY MANAGEMENT UNIT 

Military 

■ High-Performance Embedded Architecture 
— 25 MIPS Burst Execution at 25 MHz 
— 9.4 MIPS* Sustained Execution at 25 MHz 

■ On-Chip Floating-Point Unit 
— Supports IEEE 754 Floating-Point Standard 
— Full Transcendental Support 
— Four 80-Bit Registers 
— 5.2 Million Whetstones/Second at 25 MHz 

■ 512-Byte On-Chip Instruction Cache 
— Direct Mapped 
— Parallel Load/Decode for Uncached Instructions 

■ Multiple Register Sets 
— Sixteen Global 32-Bit Registers 
— Sixteen Local 32-Bit Registers 
— Four Local Register Sets Stored On-Chip (Sixteen 32-Bit Registers per Set) 
— Register Scoreboarding 

■ On-Chip Memory Management Unit 
— 4 Gigabyte Virtual Address Space per Task 
— 4 Kbyte Pages with Supervisor/User Protection 

■ Built-in Interrupt Controller 
— 32 Priority Levels 
— 248 Vectors 
— Supports M8259A 
— 3.4 µs Latency 

■ Easy to Use, High Bandwidth 32-Bit Bus 
— 66.7 MBytes/s Burst 
— Up to 16-Bytes Transferred per Burst 

■ Multitasking and Multiprocessor Support 
— Automatic Task Dispatching 
— Prioritized Task Queues 

■ Advanced Package Technology 
— 132 Lead Ceramic Pin Grid Array 
— 164 Lead Ceramic Quad Flatpack 

■ Military Temperature Range 
— −55°C to +125°C (TC) 

The 80960MC is the enhanced military member of Intel's new 32-bit microprocessor family, the 960 series, 
which is designed especially for embedded applications. It is based on the family's high performance, 
common core architecture, and includes a 512-byte instruction cache, a built-in interrupt controller, an integrated 
floating-point unit and a memory management unit. The 80960MC has a large register set, multiple parallel 
execution units, and a high-bandwidth, burst bus. Using advanced RISC technology, this high performance 
processor can respond to interrupts in under 3.4 µs and is capable of execution rates in excess of 9.4 million 
instructions per second.* The 80960MC is well-suited for a wide range of military and other high reliability 
applications, including avionics, airborne radar, navigation, and instrumentation. 

*Relative to Digital Equipment Corporation's VAX-11/780** at 1 MIPS 
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THE 960 SERIES 

The 80960MC is the enhanced military member of a 
new family of 32-bit microprocessors from Intel 
known as the 960 Series. This series was especially 
designed to serve the needs of embedded applica- 
tions. The embedded market includes applications 
as diverse as industrial automation, avionics, image 
processing, graphics, robotics, telecommunications, 
and automobiles. These types of applications re- 
quire high integration, low power consumption, quick 
interrupt response times, and high performance. 
Since time to market is critical, embedded micro- 
processors need to be easy to use in both hardware 
and software designs. 



All members of the 80960 series share a common
core architecture which utilizes RISC technology so
that, except for special functions, the family members
are object code compatible. Each new processor
in the series will add its own special set of functions
to the core to satisfy the needs of a specific
application or range of applications in the embedded
market. For example, future processors may include
a DMA controller, a timer, or an A/D converter.

The 80960MC includes an integrated Floating Point 
Unit (FPU), a Memory Management Unit (MMU), 
multitasking support, and multiprocessor support. 
There are also two commercial members of the fam- 
ily: the 80960KB processor with integrated FPU and 
the 80960KA without floating-point. 
NOTES:

1. Register g15 is reserved for stack management functions.

2. Registers r0, r1, and r2 are reserved for stack management functions.




Figure 2. Register Set 
KEY PERFORMANCE FEATURES

The 80960MC's architecture is based on the most 
recent advances in RISC technology and is ground- 
ed in Intel's long experience in designing embedded 
controllers. Many features contribute to the 
80960MC's exceptional performance: 

1. Large Register Set. Having a large number of 
registers reduces the number of times that a proces- 
sor needs to access memory. Modern compilers can 
take advantage of this feature to optimize execution 
speed. For maximum flexibility, the 80960MC pro- 
vides thirty-two 32-bit registers (sixteen local and 
sixteen global) and four 80-bit floating-point global 
registers. (See Figure 2.) 

2. Fast Instruction Execution. Simple functions
make up the bulk of instructions in most programs,
so execution speed can be greatly improved by
ensuring that these core instructions execute in as
short a time as possible. The most frequently executed
instructions, such as register-register moves,
add/subtract, logical operations, and shifts, execute
in one to two cycles. (Table 1 contains a list of
instructions.)

3. Load/Store Architecture. Like other processors
based on RISC technology, the 80960MC has a
Load/Store architecture: only the LOAD and STORE
instructions reference memory; all other instructions
operate on registers. This type of architecture simplifies
instruction decoding and is used in combination
with other techniques to increase parallelism.
Figure 3. Instruction Formats

Table 1. 80960MC Instruction Set

Data Movement:
    Load, Store, Move, Load Address, Load Physical Address

Arithmetic:
    Add, Subtract, Multiply, Divide, Remainder, Modulo, Shift

Floating Point:
    Add, Subtract, Multiply, Divide, Remainder, Scale, Round,
    Square Root, Sine, Cosine, Tangent, Arctangent, Log, Log Binary,
    Log Natural, Exponent, Classify, Copy Real Extended, Compare

Logical:
    And, Not And, And Not, Or, Exclusive Or, Not Or, Or Not, Nor,
    Exclusive Nor, Not, Nand, Rotate

Comparison:
    Compare, Conditional Compare, Compare and Increment,
    Compare and Decrement

Branch:
    Unconditional Branch, Conditional Branch, Compare and Branch

Bit and Bit Field:
    Set Bit, Clear Bit, Not Bit, Check Bit, Alter Bit, Scan for Bit,
    Scan over Bit, Extract, Modify

String:
    Move String, Move Quick String, Fill String, Compare String,
    Scan Byte for Equal

Conversion:
    Convert Real to Integer, Convert Integer to Real

Decimal:
    Move, Add with Carry, Subtract with Carry

Call/Return:
    Call, Call Extended, Call System, Return, Branch and Link

Process Management:
    Schedule Process, Saves Process, Resume Process, Load Process Time,
    Modify Process Controls, Wait, Conditional Wait, Signal, Receive,
    Conditional Receive, Send, Send Service, Atomic Add, Atomic Modify

Fault:
    Conditional Fault, Synchronize Faults

Debug:
    Modify Trace Controls, Mark, Force Mark

Miscellaneous:
    Flush Local Registers, Inspect Access, Modify Arithmetic Controls,
    Test Condition Code
4. Simple Instruction Formats. All instructions in
the 80960MC are 32 bits long and must be aligned
on word boundaries. This alignment makes it possible
to eliminate the instruction-alignment stage in
the pipeline. To simplify the instruction decoder further,
there are only five instruction formats and each
instruction uses only one format. (See Figure 3.)

5. Overlapped Instruction Execution. A load oper- 
ation allows execution of subsequent instructions to 
continue before the data has been returned from 
memory, so that these instructions can overlap the 
load. The 80960MC manages this process transpar- 
ently to software through the use of a register score- 
board. Conditional instructions also make use of a 
scoreboard so that subsequent unrelated instruc- 
tions can be executed while the conditional instruc- 
tion is pending. 

6. Integer Execution Optimization. When the re- 
sult of an operation is used as an operand in a sub- 
sequent calculation, the value is sent immediately to 
its destination register. Yet at the same time, the 
value is put back on a bypass path to the ALU, 
thereby saving the time that otherwise would be re- 
quired to retrieve the value for the next operation. 

7. Bandwidth Optimizations. The 80960MC makes
optimal use of its memory bus bandwidth because
the bus is tuned for use with the cache: the line size
of the instruction cache matches the maximum burst
size for instruction fetches. The 80960MC automatically
fetches four words in a burst and stores them
directly in the cache. Due to the size of the cache
and the fact that it is continually filled in anticipation
of needed instructions in the program flow, the
80960MC is exceptionally insensitive to memory
wait states. In fact, each wait state causes only a
7% degradation in system performance. The benefit
is that the 80960MC will deliver outstanding performance
even with a low-cost memory system.

8. Cache Bypass. If there is a cache miss, the proc- 
essor fetches the needed instruction, then sends it 
on to the instruction decoder at the same time it 
updates the cache. Thus, no extra time is taken to 
load and read the cache. 



Memory Space and Addressing Modes 

The 80960MC allows each task (process) to address
a logical memory space of up to 4 Gbytes. In
turn, each task's address space is divided into four
1-Gbyte regions and each region can be mapped to
physical addresses by zero, one, or two levels of
page tables. The region with the highest addresses
(Region 3) is common to all tasks.
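
As a rough illustration of the layout just described, the two high-order bits of a 32-bit logical address select one of the four 1-Gbyte regions. The sketch below is in Python; the names `region_of` and `is_shared_region` are hypothetical and are not part of any Intel tool or of the 80960MC instruction set.

```python
REGION_SHIFT = 30  # 1 Gbyte = 2**30 bytes, so bits 30-31 select the region


def region_of(logical_address: int) -> int:
    """Return the region number (0-3) that a 32-bit logical address falls in."""
    if not 0 <= logical_address < 2**32:
        raise ValueError("address must fit in 32 bits")
    return logical_address >> REGION_SHIFT


def is_shared_region(logical_address: int) -> bool:
    """Region 3 (the highest addresses) is common to all tasks."""
    return region_of(logical_address) == 3
```

For example, `region_of(0x40000000)` falls in region 1, and any address at or above `0xC0000000` lies in the shared region.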



In keeping with RISC design principles, the number 
of addressing modes has been kept to a minimum 
but includes all those necessary to ensure efficient 
execution of high-level languages such as Ada, C, 
and Fortran. Table 2 lists the memory addressing 
modes. 



Data Types 

The 80960MC recognizes the following data types: 

Numeric:

• 8-, 16-, 32- and 64-bit ordinals
• 8-, 16-, 32- and 64-bit integers
• 32-, 64- and 80-bit real numbers

Non-Numeric:

• Bit
• Bit Field
• Triple-Word (96 bits)
• Quad-Word (128 bits)



Large Register Set 

The programming environment of the 80960MC in- 
cludes a large number of registers. In fact, 36 regis- 
ters are available at any time. The availability of this 
many registers greatly reduces the number of mem- 
ory accesses required to execute most programs, 
which leads to greater instruction processing speed. 

There are two types of general-purpose registers:
local and global. The 20 global registers consist of
sixteen 32-bit registers (G0 through G15) and four
80-bit registers (FP0 through FP3). These registers
perform the same function as the general-purpose
registers provided in other popular microprocessors.
The term global refers to the fact that these registers
retain their contents across procedure calls.

The local registers, on the other hand, are proce- 
dure specific. For each procedure call, the 80960MC 
allocates 16 local registers (R0 through R15). Each 
local register is 32 bits wide. Any register can also 
be used for floating-point operations; the 80-bit float- 
ing-point registers are provided for extended preci- 
sion. 



Multiple Register Sets 

To further increase the efficiency of the register set, 
multiple sets of local registers are stored on-chip. 
This cache holds up to four local register frames, 
which means that up to three procedure calls can be 
made without having to access the procedure stack 
resident in memory. 
Table 2. Memory Addressing Modes

• 12-Bit Offset
• 32-Bit Offset
• Register-Indirect
• Register + 12-Bit Offset
• Register + 32-Bit Offset
• Register + (Index-Register x Scale-Factor)
• Register x Scale Factor + 32-Bit Displacement
• Register + (Index-Register x Scale-Factor) + 32-Bit Displacement


Scale-Factor is 1, 2, 4, 8 or 16 
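
The most general mode in Table 2, Register + (Index-Register x Scale-Factor) + 32-Bit Displacement, can be sketched as a small calculation; the simpler modes follow by leaving terms at zero. This is an illustrative Python model only: the function name `effective_address` is hypothetical, and the wraparound to 32 bits is an assumption rather than a statement from this data sheet.

```python
VALID_SCALE_FACTORS = (1, 2, 4, 8, 16)  # scale factors listed with Table 2


def effective_address(base: int = 0, index: int = 0,
                      scale: int = 1, displacement: int = 0) -> int:
    """Compute base + index * scale + displacement as a 32-bit address."""
    if scale not in VALID_SCALE_FACTORS:
        raise ValueError("scale factor must be 1, 2, 4, 8 or 16")
    # Assumption: the sum is truncated to 32 bits.
    return (base + index * scale + displacement) & 0xFFFFFFFF
```

For instance, indexing the fourth 4-byte element of an array that starts 8 bytes past a base of `0x1000` gives `effective_address(0x1000, 3, 4, 8)`, which evaluates to `0x1014`.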



Although programs may have procedure calls nested
many levels deep, a program typically oscillates
back and forth between only two or three levels. As
a result, with four stack frames in the cache, the
probability that a free frame is available in the cache
when a call is made is very high. In fact, runs of
representative C-language programs show that 80%
of the calls are handled without needing to access
memory.

If there are four or more active procedures and a
new procedure is called, the processor moves the
oldest set of local registers in the register cache to a
procedure stack in memory to make room for a new
set of registers. Global register G15 is used by the
processor as the frame pointer (FP) for the procedure
stack.

Note that the global and floating-point registers are 
not exchanged on a procedure call, but retain their 
contents, making them available to all procedures 
for fast parameter passing. An illustration of the reg- 
ister cache is shown in Figure 4. 
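
The behavior of the register cache can be modeled in a few lines. The Python sketch below is a simplified illustration under stated assumptions, not a description of the actual hardware: it assumes the oldest frame is spilled when all four on-chip frames are in use, and that a spilled frame is reloaded when execution returns to it. The class name `RegisterCache` and its counters are hypothetical.

```python
from collections import deque


class RegisterCache:
    """Toy model of an on-chip cache of local register frames."""

    def __init__(self, frames: int = 4):
        self.frames = frames
        self.depth = 0              # id of the currently executing frame
        self.on_chip = deque([0])   # frame ids held on-chip, oldest first
        self.spills = 0             # frames written out to the memory stack
        self.fills = 0              # frames read back from the memory stack

    def call(self) -> None:
        if len(self.on_chip) == self.frames:
            self.on_chip.popleft()  # oldest frame moves to memory
            self.spills += 1
        self.depth += 1
        self.on_chip.append(self.depth)

    def ret(self) -> None:
        if self.depth == 0:
            raise RuntimeError("return without a matching call")
        self.on_chip.pop()
        self.depth -= 1
        if not self.on_chip:        # caller's frame was spilled earlier
            self.on_chip.append(self.depth)
            self.fills += 1
```

In this model the first three calls cause no memory traffic, the fourth call spills one frame, and a program that then oscillates between the two deepest levels causes no further spills or fills.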
Figure 4. Multiple Register Sets Are Stored On-Chip

Instruction Cache

To further reduce memory accesses, the 80960MC 
includes a 512-byte on-chip instruction cache. The 
instruction cache is based on the concept of locality 
of reference; that is, most programs are not usually 
executed in a steady stream but consist of many 
branches and loops that lead to jumping back and 
forth within the same small section of code. Thus, by 
maintaining a block of instructions in a cache, the 
number of memory references required to read in- 
structions into the processor can be greatly reduced. 

To load the instruction cache, instructions are
fetched in 16-byte blocks, so that up to four instructions
can be fetched at one time. An efficient
prefetch algorithm increases the probability that an
instruction will already be in the cache when it is
needed.

Code for small loops will often fit entirely within the
cache, leading to a great increase in processing
speed since further memory references might not be
necessary until the program exits the loop. Similarly,
when calling short procedures, the code for the calling
procedure is likely to remain in the cache, so it
will be there on the procedure's return.
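
To make the cache geometry concrete, a 512-byte cache filled in 16-byte blocks holds 32 lines. The Python sketch below models a textbook direct-mapped cache with those dimensions; the exact index and tag bit assignment inside the 80960MC is not given here, so the split used in `cache_slot` is an assumption, and both function names are hypothetical.

```python
CACHE_BYTES = 512
LINE_BYTES = 16
NUM_LINES = CACHE_BYTES // LINE_BYTES  # 32 lines


def cache_slot(address: int) -> tuple:
    """Return (line index, tag) for an instruction address."""
    line_number = address // LINE_BYTES
    return line_number % NUM_LINES, line_number // NUM_LINES


def loop_fits_in_cache(start: int, end: int) -> bool:
    """True if the code in [start, end) touches at most NUM_LINES lines.

    Contiguous code spanning no more than NUM_LINES lines maps to distinct
    slots in a direct-mapped cache, so it can stay resident for the loop.
    """
    first_line = start // LINE_BYTES
    last_line = (end - 1) // LINE_BYTES
    return last_line - first_line + 1 <= NUM_LINES
```

Under this model, two addresses exactly 512 bytes apart compete for the same line, while any contiguous loop of up to 512 bytes that begins on a 16-byte boundary fits entirely.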

Register Scoreboarding 

The instruction decoder has been optimized in sev- 
eral ways. One of these optimizations is the ability to 
do instruction overlapping by means of register 
scoreboarding. 

Register scoreboarding occurs when a LOAD instruction
is executed to move a variable from memory
into a register. When the instruction is initiated, a
scoreboard bit on the target register is set. When the
register is actually loaded, the bit is reset. In between,
any reference to the register contents is accompanied
by a test of the scoreboard bit to ensure
that the load has completed before processing continues.
Since the processor does not have to wait for
the LOAD to be completed, it can go on to execute
additional instructions placed in between the LOAD
instruction and the instruction that uses the register
contents, as shown in the following example:

LOAD R4, address 1 
LOAD R5, address 2 
Unrelated instruction 
Unrelated instruction 
ADD R4, R5, R6 

In essence, the two unrelated instructions between
the LOAD and ADD instructions are executed for
free (i.e., take no apparent time to execute) because
they are executed while the register is being loaded.
Up to three LOAD instructions can be pending at
one time with three corresponding scoreboard bits
set. By exploiting this feature, system programmers
and compilers have a useful tool for optimizing execution
speed.
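
The effect described above can be reproduced with a toy timing model. The Python sketch below is illustrative only: it assumes one issue cycle per instruction and a fixed load latency of four cycles, neither of which is taken from this data sheet, and it does not model the limit of three pending loads. The function name `run`, the tuple layout, and the MOV instructions standing in for "unrelated instructions" are all hypothetical.

```python
def run(program, load_latency: int = 4) -> int:
    """Return the cycle count for a list of (op, dest, sources) tuples."""
    ready_at = {}  # register -> cycle at which its scoreboard bit clears
    cycle = 0
    for op, dest, sources in program:
        # Stall until the scoreboard bit of every source register is clear.
        for reg in sources:
            cycle = max(cycle, ready_at.get(reg, 0))
        if op == "LOAD":
            ready_at[dest] = cycle + load_latency  # bit set until data lands
        cycle += 1  # assumed: one issue cycle per instruction
    return cycle


with_fillers = [
    ("LOAD", "R4", ()),
    ("LOAD", "R5", ()),
    ("MOV", "R8", ("R7",)),        # unrelated instruction
    ("MOV", "R9", ("R7",)),        # unrelated instruction
    ("ADD", "R6", ("R4", "R5")),   # needs both loaded values
]
without_fillers = [with_fillers[0], with_fillers[1], with_fillers[4]]
```

With these assumptions both sequences finish in the same number of cycles, which is the sense in which the two intervening instructions cost nothing.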



Memory Management and Protection 

The 80960MC will be especially useful for multitask- 
ing applications that require software protection and 
a very large address space. To ensure the highest 
level of performance possible, the memory manage- 
ment unit and translation look-aside buffer (TLB) are 
contained on-chip. 

The 80960MC supports a conventional form of demand-paged
virtual memory in which the address
space is divided into 4-Kbyte pages. Studies have
shown that a 4-Kbyte page is the optimum size for a
broad range of applications.
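
With 4-Kbyte pages, the low 12 bits of an address are the offset within a page and the remaining high-order bits identify the page. A minimal Python sketch follows; the function name `split_address` is hypothetical.

```python
PAGE_BYTES = 4096  # 4 Kbytes = 2**12 bytes


def split_address(address: int) -> tuple:
    """Return (page number, offset within page) for an address."""
    return address // PAGE_BYTES, address % PAGE_BYTES
```

For example, address `0x12345678` lies at offset `0x678` within page `0x12345`.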

Each page table entry includes a 2-bit page rights 
field that specifies whether the page is a no-access, 
read-only, or read-write page. This field is interpret- 
ed differently depending on whether the current task 
(process) is executing in user or supervisor mode, as 
shown below: 



Rights    User          Supervisor

00        No Access     Read-Only
01        No Access     Read-Write
10        Read-Only     Read-Write
11        Read-Write    Read-Write
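
The page rights encoding lends itself to a simple lookup. The Python sketch below mirrors the 2-bit rights field (00 and 01 give the user no access; 10 gives the user read-only access; 11 gives read-write; the supervisor has read-only access for 00 and read-write otherwise). The function name `access_allowed` and the string constants are hypothetical and serve only as an illustration.

```python
NO_ACCESS, READ_ONLY, READ_WRITE = "no-access", "read-only", "read-write"

# rights field -> (user access, supervisor access)
PAGE_RIGHTS = {
    0b00: (NO_ACCESS, READ_ONLY),
    0b01: (NO_ACCESS, READ_WRITE),
    0b10: (READ_ONLY, READ_WRITE),
    0b11: (READ_WRITE, READ_WRITE),
}


def access_allowed(rights: int, supervisor: bool, write: bool) -> bool:
    """Check a read or write against the 2-bit page rights field."""
    user_level, supervisor_level = PAGE_RIGHTS[rights & 0b11]
    level = supervisor_level if supervisor else user_level
    if level == NO_ACCESS:
        return False
    return level == READ_WRITE if write else True
```

A user-mode write to a page with rights 10 is refused, for example, while the same write in supervisor mode is allowed.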



Floating-Point Arithmetic 

In the 80960MC, floating-point arithmetic has been 
made an integral part of the architecture. Having the 
floating-point unit integrated on-chip provides two 
advantages. First, it improves the performance of 
the chip for floating-point applications, since no 
additional bus overhead is associated with floating- 
point calculations, thereby leaving more time for oth- 
er bus operations such as I/O. Second, the cost of 
using floating-point operations is reduced because a 
separate coprocessor chip is not required. 

The 80960MC floating-point (real number) data 
types include single-precision (32-bit), double-preci- 
sion (64-bit), and extended precision (80-bit) float- 
ing-point numbers. Any register may be used to exe- 
cute floating-point operations. 
The processor provides hardware support for both
mandatory and recommended portions of IEEE
Standard 754 for floating-point arithmetic, including
all arithmetic, exponential, logarithmic, and other
transcendental functions. Table 3 shows execution
times for some representative instructions.

Table 3. Sample Floating-Point Execution Times (µs) at 25 MHz

Function       32-Bit    64-Bit

Add             0.4       0.5
Subtract        0.4       0.5
Multiply        0.7       1.3
Divide          1.3       2.9
Square Root     3.7       3.9
Arctangent     10.1      13.1
Exponent       11.3      12.5
Sine           15.2      16.6
Cosine         15.2      16.6
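
The execution times in Table 3 can be converted to approximate clock counts by multiplying by the clock frequency in MHz. The Python sketch below shows the arithmetic; because the table values are rounded to 0.1 µs, the results are estimates only, and the helper names are hypothetical.

```python
CLOCK_MHZ = 25  # clock rate used for Table 3


def approx_cycles(time_us: float, clock_mhz: float = CLOCK_MHZ) -> float:
    """Approximate clock cycles for an execution time given in microseconds."""
    return round(time_us * clock_mhz, 1)


def operations_per_second(time_us: float) -> float:
    """Back-to-back operations per second, ignoring all other overhead."""
    return 1_000_000 / time_us
```

A 32-bit add at 0.4 µs corresponds to about 10 clocks, and a 64-bit sine at 16.6 µs to about 415 clocks.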



Multitasking Support 

Multitasking programs commonly involve the moni- 
toring and control of an external operation, such as 
the activities of a process controller or the move- 
ments of a machine tool. These programs generally 
consist of a number of processes that run indepen- 
dently of one another, but share a common data- 
base or pass data among themselves. 

The 80960MC offers several hardware functions de- 
signed to support multitasking systems. One unique 
feature, called self-dispatching, allows a processor 
to switch itself automatically among scheduled 
tasks. When self-dispatching is used, all the operat- 
ing system is required to do is place the task in the 
scheduling queue. 

When the processor becomes available, it dis- 
patches the task from the beginning of the queue 
and then executes it until it becomes blocked, inter- 
rupted, or until its time-slice expires. It then returns 
the task to the end of the queue (i.e., automatically 
reschedules it) and dispatches the next ready task. 



During these operations, no communication be- 
tween the processor and the operating system is 
necessary until the running task is complete or an 
interrupt is issued. 
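
The dispatch cycle described above resembles round-robin scheduling from a single queue. The Python sketch below is a simplified software illustration of that cycle and not the processor's actual mechanism: it ignores blocking, interrupts, and priorities, and the function name `self_dispatch` and the notion of "work units" are hypothetical.

```python
from collections import deque


def self_dispatch(tasks: dict, time_slice: int) -> list:
    """Return the order in which tasks are dispatched.

    tasks maps a task name to the work units it still needs; a task runs
    for at most time_slice units each time it is dispatched.
    """
    queue = deque(tasks.items())
    order = []
    while queue:
        name, remaining = queue.popleft()     # dispatch from the front
        order.append(name)
        remaining -= min(time_slice, remaining)
        if remaining > 0:
            queue.append((name, remaining))   # slice expired: back of queue
    return order
```

With tasks A, B, and C needing 3, 1, and 2 units and a time slice of 2, the dispatch order is A, B, C, A.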



Synchronization and Communication 

The 80960MC also offers instructions to set up and
test semaphores to ensure that concurrent tasks
remain synchronized and no data inconsistency
results. Special data structures, known as communication
ports, provide the means for exchanging
parameters and data structures. Transmission of information
by means of communication ports is asynchronous
and automatically buffered by the processor.



Communication between tasks by means of ports 
can be carried out independently of the operating 
system. Once the ports have been set up by the 
programmer, the processor handles the message 
passing automatically. 



High Bandwidth Local Bus 

An 80960MC CPU resides on a high-bandwidth ad- 
dress/data bus known as the local bus (L-Bus). The 
L-Bus provides a direct communication path be- 
tween the processor and the memory and I/O sub- 
system interfaces. The processor uses the local bus 
to fetch instructions, manipulate memory, and re- 
spond to interrupts. Its features include: 

• 32-bit multiplexed address/data path

• Four-word burst capability, which allows transfers
from 1 to 16 bytes at a time

• High bandwidth reads and writes at 66.7 MBytes
per second

• Special signal to indicate whether a memory
transaction can be cached

Figure 5 identifies the groups of signals which con- 
stitute the L-Bus. Table 4 lists the function of the L- 
Bus and other processor-support signals, such as 
the interrupt lines. 
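
One way to arrive at the quoted 66.7 MBytes per second is to assume a 25 MHz bus clock and a four-word burst that occupies six clocks: four data cycles plus two overhead cycles (for example one address cycle and one recovery cycle). That cycle breakdown is an assumption chosen because it reproduces the figure; it is not stated in this section. The Python sketch below shows the arithmetic, with a hypothetical function name.

```python
def burst_bandwidth_mbytes(clock_mhz: float, words: int = 4,
                           overhead_cycles: int = 2) -> float:
    """Peak burst bandwidth in MBytes per second.

    Assumes one data cycle per 32-bit word plus a fixed number of
    overhead cycles per burst and no wait states.
    """
    bytes_moved = words * 4
    cycles = words + overhead_cycles
    return bytes_moved * clock_mhz / cycles
```

At 25 MHz this gives 16 bytes every 6 clocks, or roughly 66.7 MBytes per second.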
Figure 5. Local Bus Signal Groups: control (address, data, and operation signals, 15 lines) and arbitration (2 lines)



Multiple Processor Support 

One means of increasing the processing power of a 
system is to run two or more processors in parallel. 
Since microprocessors are not generally designed to 
run in tandem with other processors, designing such 
a system is usually difficult and costly. 

The 80960MC solves this problem by offering a 
number of functions to coordinate the actions of 
multiple processors. First, messages can be passed 
between processors to initiate actions such as flush- 
ing a cache, stopping or starting another processor, 
or preempting a task. The messages are passed on 
the bus and allow multiple processors to run togeth- 
er smoothly, with rare need to lock the bus or memo- 
ry. 

Second, a set of synchronization instructions helps
maintain the coherency of memory. These instructions
permit several processors to modify memory at
the same time without introducing inaccuracies or
ambiguities into shared data structures.

The self-dispatching mechanism, in addition to being 
used in single-processor systems, provides the 
means to increase the performance of a system 
merely by adding processors. Each processor can 
either work on the same pool of tasks (sharing the 
same queue with other processors) or can be re- 
stricted to its own queue. 

When processors perform system operations, they
synchronize themselves by using atomic operations
and sending special messages between each other.
Changing the number of processors in a system
never requires a software change. Software will execute
correctly regardless of the number of processors
in the system; systems with more processors
simply execute faster.



Interrupt Handling 

The 80960MC can be interrupted in one of two 
ways: by the activation of one of four interrupt pins 
or by sending a message on the processor's data 
bus. 

The 80960MC is unusual in that it automatically han- 
dles interrupts on a priority basis and tracks pending 
interrupts through its on-chip interrupt controller. 
Two of the interrupt pins can be configured to pro- 
vide M8259A handshaking for expansion beyond 
four interrupt lines. 

An interrupt message is made up of a vector number 
and an interrupt priority. If the interrupt priority is 
greater than that of the currently running task, the 
processor accepts the interrupt and uses the vector 
as an index into the interrupt table. If the priority of 
the interrupt message is below that of the current 
task, the processor saves the information in a sec- 
tion of the interrupt table reserved for pending inter- 
rupts. 
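
The accept-or-pend decision can be summarized in a few lines. The Python sketch below is an illustration only: the text specifies what happens when the interrupt priority is greater than or below that of the current task, so treating an equal priority as pending is an assumption, as are the function name `post_interrupt` and the use of a list to stand in for the pending-interrupt section of the interrupt table.

```python
def post_interrupt(vector: int, priority: int,
                   current_priority: int, pending: list):
    """Return the vector to service now, or None if the interrupt is pended."""
    if priority > current_priority:
        return vector                     # accepted: index into interrupt table
    # Assumption: equal priority is pended along with lower priority.
    pending.append((priority, vector))
    return None
```

A message with priority 20 arriving while a priority-10 task runs is serviced immediately; one with priority 5 is recorded as pending.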



Debug Features 

The 80960MC has built-in debug capabilities. There
are two types of breakpoints and six different trace
modes. The debug features are controlled by two
internal 32-bit registers, the Process-Controls Word
and the Trace-Controls Word. By setting bits in
these control words, a software debug monitor can
closely control how the processor responds during
program execution.

The 80960MC has both hardware and software 
breakpoints. It provides two hardware breakpoint 
registers on-chip which can be set by a special com- 
mand to any value. When the instruction pointer 
matches the value in one of the breakpoint registers, 
the breakpoint will fire, and a breakpoint handling 
routine is called automatically. 

The 80960MC also provides software breakpoints 
through the use of two instructions, MARK and 
FMARK. These instructions can be placed at any 
point in a program and will cause the processor to 
halt execution at that point and call the breakpoint 
handling routine. The breakpoint mechanism is easy 
to use and provides a powerful debugging tool. 

Tracing is available for instructions (single-step exe- 
cution), calls and returns, and branching. Each dif- 
ferent type of trace may be enabled separately by a 
special debug instruction. In each case, the 
80960MC executes the instruction first and then 
calls a trace handling routine (usually part of a soft- 
ware debug monitor). Further program execution is 
halted until the trace routine is completed. When the 
trace event handling routine is completed, instruc- 
tion execution resumes at the next instruction. The 
80960MC's tracing mechanisms, which are imple- 
mented completely in hardware, greatly simplify the 
task of testing and debugging software. 



FAULT DETECTION 

The 80960MC has an automatic mechanism to 
handle faults. There are ten fault types including 
trace, arithmetic, and floating-point faults. When the 
processor detects a fault, it automatically calls the 
appropriate fault handling routine and saves the cur- 
rent instruction pointer and necessary state informa- 
tion to make efficient recovery possible. The proces- 
sor posts diagnostic information on the type of fault 
to a Fault Record. Like interrupt handling routines, 
fault handling routines are usually written to meet 
the needs of a specific application and are often in- 
cluded as part of the operating system or kernel. 

For each of the ten fault types, there are numerous 
subtypes that provide specific information about a 
fault. For example, a floating-point fault may have its 
subtype set to an Overflow or Zero-Divide fault. The 
fault handler can use this specific information to re- 
spond correctly to the fault. 



Interagent Communications (IAC)

In order to coordinate their actions, processors in a
multiple processor system need a means for communicating
with each other. The 80960MC does this
through a mechanism known as Interagent Communication
messages or IACs.

IAC messages cause a variety of actions including 
starting and stopping processors, flushing instruc- 
tion caches and TLBs, and sending interrupts to oth- 
er processors in the system. The upper 16 Mbytes of 
the processor's physical memory space is reserved 
for sending and receiving IAC messages. 
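
Assuming the 32-bit physical addresses carried on the local bus, the upper 16 Mbytes of the physical address space begin at 2^32 - 2^24, which is FF000000 hexadecimal. The short Python sketch below shows that arithmetic; the names `IAC_BASE` and `is_iac_address` are hypothetical.

```python
ADDRESS_SPACE_BYTES = 2**32              # assumes 32-bit physical addresses
IAC_REGION_BYTES = 16 * 2**20            # upper 16 Mbytes
IAC_BASE = ADDRESS_SPACE_BYTES - IAC_REGION_BYTES   # 0xFF000000


def is_iac_address(physical_address: int) -> bool:
    """True if the address lies in the region reserved for IAC messages."""
    return IAC_BASE <= physical_address < ADDRESS_SPACE_BYTES
```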



BUILT-IN TESTABILITY 

Upon reset, the 80960MC automatically conducts an 
exhaustive internal test of its major blocks of logic. 

Then, before executing its first instruction, it does a 
zero check sum on the first eight words in memory 
to ensure that the system has been loaded correctly. 
If a problem is discovered at any point during the 
self-test, the 80960MC will assert its FAILURE pin 
and will not begin program execution. The self-test 
takes approximately 47,000 cycles to complete. 

System manufacturers can use the 80960MC's self- 
test feature during incoming parts inspection. No 
special diagnostic programs need to be written, and 
the test is both thorough and fast. The self-test ca- 
pability helps ensure that defective parts will be dis- 
covered before systems are shipped, and once in 
the field, the self-test makes it easier to distinguish 
between problems caused by processor failure and 
problems resulting from other causes. 



COMPATIBILITY WITH 80960K-SERIES 

Application programs written for the 80960K-Series 
microprocessors can be run on the 80960MC with- 
out modification. The 80960K-Series instruction set 
forms the core of the 80960MC's instructions, so bi- 
nary compatibility is assured. 
CHMOS

The 80960MC is fabricated using Intel's CHMOS IV
(Complementary High Speed Metal Oxide Semiconductor)
process. This advanced technology eliminates
the frequency and reliability limitations of older
CMOS processes and opens a new era in microprocessor
performance. It combines the high performance
capabilities of Intel's industry-leading
HMOS technology with the high density and low
power characteristics of CMOS. The 80960MC is
available at 16, 20 and 25 MHz.



Table 4a. 80960MC Pin Description: L-Bus Signals 



Symbol        Type    Name and Function

CLK2          I       SYSTEM CLOCK provides the fundamental timing for 80960MC
                      systems. It is divided by two inside the 80960MC to
                      generate the internal processor clock. CLK2 is shown in
                      Figure 9.

LAD31-LAD0    I/O     LOCAL ADDRESS/DATA BUS carries 32-bit physical addresses
              T.S.    and data to and from memory. During an address (Ta)
                      cycle, bits 2-31 contain a physical word address (bits
                      0-1 indicate SIZE; see below). During a data (Td) cycle,
                      bits 0-31 contain read or write data. The LAD lines are
                      active HIGH and float to a high impedance state when not
                      active.

                      SIZE, which comprises bits 0-1 of the LAD lines during a
                      Ta cycle, specifies the size of a transfer in words for
                      a burst transaction.

                          LAD1   LAD0   Size
                          0      0      1 Word
                          0      1      2 Words
                          1      0      3 Words
                          1      1      4 Words

ALE           O       ADDRESS-LATCH ENABLE indicates the transfer of a
              T.S.    physical address. ALE is asserted during a Ta cycle and
                      deasserted before the beginning of the Td state. It is
                      active LOW and floats to a high impedance state when the
                      processor is idle or is at the end of any bus access.

ADS           O       ADDRESS STATUS indicates an address state. ADS is
              O.D.    asserted every Ta state and deasserted during the
                      following Td state. For a burst transaction, ADS is
                      asserted again every Td state where READY was asserted
                      in the previous cycle.

W/R           O       WRITE/READ specifies, during a Ta cycle, whether the
              O.D.    operation is a write or read. It is latched on-chip and
                      remains valid during Td and Tw states.

DT/R          O       DATA TRANSMIT/RECEIVE indicates the direction of data
              O.D.    transfer to and from the L-Bus. It is low during Ta, Tw
                      and Td cycles for a read or interrupt acknowledgement;
                      it is high during Ta, Tw and Td cycles for a write. DT/R
                      never changes state when DEN is asserted (see Timing
                      Diagrams).

DEN           O       DATA ENABLE is asserted during Td and Tw cycles and
              O.D.    indicates transfer of data on the LAD bus lines.

READY         I       READY indicates that data on LAD lines can be sampled or
                      removed. If READY is not asserted during a Td cycle, the
                      Td cycle is extended to the next cycle by inserting wait
                      states (Tw), and ADS is not asserted in the next cycle.

LOCK          I/O     BUS LOCK prevents other bus masters from gaining control
              O.D.    of the L-Bus following the current cycle (if they would
                      assert LOCK to do so). LOCK is used by the processor or
                      any bus agent when it performs indivisible
                      Read/Modify/Write (RMW) operations.

                      For a read that is designated as an RMW-read, LOCK is
                      examined: if asserted, the processor waits until it is
                      not asserted; if not asserted, the processor asserts
                      LOCK during the Ta cycle and leaves it asserted.

                      A write that is designated as an RMW-write deasserts
                      LOCK in the Ta cycle.

I/O = Input/Output, O = Output, I = Input, O.D. = Open-Drain, T.S. = three state
Ta = TAddress, Td = TData, Tw = TWait, Tr = TRecovery, Ti = TIdle, Th = THold
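
The address-cycle encoding in the LAD description can be decoded as follows. This Python sketch assumes the SIZE encoding in which bits 0-1 of the LAD lines equal to 00 mean one word and 11 mean four words, with bits 2-31 carrying the word address; the function name `decode_address_cycle` is hypothetical.

```python
def decode_address_cycle(lad: int) -> tuple:
    """Split the LAD value seen during a Ta cycle.

    Returns (byte address of the first word, number of words in the burst).
    """
    size_words = (lad & 0b11) + 1        # SIZE field: 00 -> 1 word ... 11 -> 4
    word_address = lad & 0xFFFFFFFC      # bits 2-31, low two bits cleared
    return word_address, size_words
```

For example, a Ta value of `0x00001003` requests a four-word burst starting at address `0x1000`.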
Table 4a. 80960MC Pin Description: L-Bus Signals (Continued)

Symbol        Type    Name and Function

BE3-BE0       O       BYTE ENABLE LINES specify which data bytes (up to four)
              O.D.    on the bus take part in the current bus cycle. BE3
                      corresponds to LAD31-LAD24 and BE0 corresponds to
                      LAD7-LAD0.

                      The byte enables are provided in advance of data. The
                      byte enables asserted during Ta specify the bytes of the
                      first data word. The byte enables asserted during Td
                      specify the bytes of the next data word (if any), that
                      is, the word to be transmitted following the next
                      assertion of READY. The byte enables during the Td
                      cycles preceding the last assertion of READY are
                      undefined. The byte enables are latched on-chip and
                      remain constant from one Td cycle to the next when READY
                      is not asserted.

                      For reads, the byte enables specify the byte(s) that the
                      processor will actually use. 80960MCs will assert only
                      adjacent byte enables (e.g., asserting just BE0 and BE2
                      is not permitted), and are required to assert at least
                      one byte enable. Accesses must also be naturally aligned
                      (e.g., asserting BE1 and BE2 is not allowed even though
                      they are adjacent). To produce address bits A0 and A1
                      externally, they can be decoded from the byte enables.

HOLD          I       HOLD indicates a request from a secondary bus master to
(HLDAR)               acquire the bus. If the processor is initialized as the
                      primary bus master this input will be interpreted as
                      HOLD. When the processor receives HOLD and grants
                      another master control of the bus, it floats its
                      three-state bus lines, asserts HOLD ACKNOWLEDGE, and
                      enters the Th state. When HOLD is deasserted, the
                      processor will deassert HOLD ACKNOWLEDGE and go to
                      either the Ti or Ta state.

                      HOLD ACKNOWLEDGE RECEIVED indicates that the processor
                      has acquired the bus. If the processor is initialized as
                      the secondary bus master this input is interpreted as
                      HLDAR.

                      HOLD timing is shown in Figure 11.

HLDA          O       HOLD ACKNOWLEDGE relinquishes control of the bus to
(HOLDR)       T.S.    another bus master. If the processor is initialized as
                      the primary bus master this output will be interpreted
                      as HLDA. When HOLD is deasserted, the processor will
                      deassert HLDA and go to either the Ti or Ta state.

                      HOLD REQUEST indicates a request to acquire the bus. If
                      the processor is initialized as the secondary bus master
                      this output will be interpreted as HOLDR.

                      HOLD timing is shown in Figure 11.

CACHE         O       CACHE indicates if an access is cacheable during a Ta
              T.S.    cycle. The CACHE signal floats to a high impedance state
                      when the processor is idle.
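
The byte-enable rules (at least one enable, adjacent enables only, natural alignment) and the decoding of A0 and A1 can be sketched as below. This Python model represents the four enables as a 4-bit value in which bit n set means BEn is asserted, independent of the electrical polarity of the pins. Limiting the legal patterns to single bytes, aligned byte pairs, and the full word is an interpretation of those rules, and the names used are hypothetical.

```python
VALID_BYTE_ENABLES = {
    0b0001, 0b0010, 0b0100, 0b1000,   # single byte
    0b0011, 0b1100,                   # naturally aligned pair of bytes
    0b1111,                           # full 32-bit word
}


def low_address_bits(byte_enables: int) -> tuple:
    """Return (A0, A1) decoded from the lowest asserted byte enable."""
    if byte_enables not in VALID_BYTE_ENABLES:
        raise ValueError("byte enables must be adjacent and naturally aligned")
    # Isolate the lowest set bit and convert it to a byte position (0-3).
    lowest_byte = (byte_enables & -byte_enables).bit_length() - 1
    return lowest_byte & 1, lowest_byte >> 1
```

Asserting BE2 and BE3 therefore decodes to A0 = 0 and A1 = 1, while asserting BE1 and BE2 is rejected as misaligned.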
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Table 4b. 80960MC Pin Description: Module Support Signals 



Symbol 


Type 


Name and Function 


BADAC 


I 


BAD ACCESS, if asserted in the cycle following the one in which the last READY 
of a transaction is asserted, indicates that an unrecoverable error has occurred on 
the current bus transaction, or that a synchronous load/store instruction has not 
been acknowledged. 


STARTUP: During system reset, the BADAC signal is interpreted differently. If the 
signal is high, it indicates that this processor will perform system initialization. If it 
is low, another processor in the system will perform system initialization instead. 


RESET 


I 


RESET clears the internal logic of the processor and causes it to re-initialize. 


During RESET assertion, the input pins are ignored (except for BADAC and 
IAC/INT0), the three-state output pins are placed in a high impedance state, and 
other output pins are placed in their non-asserted state. 

RESET must be asserted for at least 41 CLK2 cycles for a predictable RESET. 
The HIGH to LOW transition of RESET should occur after the rising edge of both 
CLK2 and the external bus CLK, and before the next rising edge of CLK2. 

RESET timing is shown in Figure 10. 


FAILURE 



O.D. 


INITIALIZATION FAILURE indicates that the processor has failed to initialize 
correctly. After RESET is deasserted and before the first bus transaction begins, 
FAILURE is asserted while the processor performs a self-test. If the self-test 
completes successfully, then FAILURE is deasserted. Next, the processor 
performs a zero checksum on the first eight words of memory. If it fails, FAILURE 
is asserted for a second time and remains asserted; if it passes, system 
initialization continues and FAILURE remains deasserted. 


N.C. 


N/A 


NOT CONNECTED indicates pins should not be connected. Never connect any 
pin marked N.C. 


IAC 
(INT0) 


I 


INTERAGENT COMMUNICATION REQUEST/INTERRUPT indicates either 
that there is a pending IAC message for the processor or an interrupt. The bus 
interrupt control register determines in which way the signal should be interpreted. 
To signal an interrupt or IAC request in a synchronous system, this pin (as well as 
the other interrupt pins) must be enabled by being deasserted for at least one bus 
cycle and then asserted for at least one additional bus cycle; in an asynchronous 
system, the pin must remain deasserted for at least two bus cycles and then be 
asserted for at least two more bus cycles. 

LOCAL PROCESSOR NUMBER: This signal is interpreted differently during 
system reset. If the signal is at a high voltage level, it indicates that this processor 
is a primary bus master (Local Processor Number = 0); if it is at a low voltage 
level, it indicates that this processor is a secondary bus master (Local Processor 
Number = 1). 


INT1 


I 


INTERRUPT 1, like INT0, provides direct interrupt signaling. 


INT2 
(INTR) 


I 


INTERRUPT 2/INTERRUPT REQUEST: The bus interrupt control register determines 
how this pin is interpreted. If INT2, it has the same interpretation as the INT0 and 
INT1 pins. If INTR, it is used to receive an interrupt request from an external 
interrupt controller. 


INT3 
(INTA) 


I/O 
O.D. 


INTERRUPT 3/INTERRUPT ACKNOWLEDGE: The bus interrupt control register 
determines how this pin is interpreted. If INT3, it has the same interpretation as 
the INT0, INT1, and INT2 pins. If INTA, it is used as an output to control 
interrupt-acknowledge bus transactions. The INTA output is latched on-chip and 
remains valid during Td cycles; as an output, it is open-drain. 



I/O = Input/Output, O = Output, I = Input, O.D. = Open-Drain, T.S. = Three-State 
Ta = TAddress, Td = TData, Tw = TWait, Tr = TRecovery, Ti = TIdle, Th = THold 
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ELECTRICAL SPECIFICATIONS 
Power and Grounding 

The 80960MC is implemented in CHMOS III technology 
and has modest power requirements. Its high 
clock frequency and numerous output buffers 
(address/data, control, error and arbitration signals) 
can cause power surges as multiple output buffers 
drive new signal levels simultaneously. For clean 
on-chip power distribution at high frequency, 12 VCC 
and 13 VSS pins separately feed functional units of 
the 80960MC. 

Power and ground connections must be made to all 
power and ground pins of the 80960MC. On the 
circuit board, all VCC pins must be strapped closely 
together, preferably on a power plane. Likewise, all 
VSS pins should be strapped together, preferably on 
a ground plane. 



Power Decoupling Recommendations 

Liberal decoupling capacitance should be placed 
near the 80960MC. The processor can cause 
transient power surges when driving the L-Bus, 
particularly when it is connected to a large capacitive load. 

Low inductance capacitors and interconnects are 
recommended for best high frequency electrical 
performance. Inductance can be reduced by shortening 
the board traces between the processor and 
decoupling capacitors as much as possible. 



Connection Recommendations 

For reliable operation, always connect unused inputs 
to an appropriate signal level. In particular, if 
one or more interrupt lines are not used, they should 
be pulled up or down to their respective deasserted 
states. No inputs should ever be left floating. 

All open-drain outputs require a pullup device. While 
in some cases a simple pullup resistor will be 
adequate, we recommend a network of pullup and 
pulldown resistors biased to a valid VIH (≥ 3.4V) and 
terminated in the characteristic impedance of the 
circuit board. Figure 6 shows our recommendations for 
the resistor values for both a low and high current 
drive network, which assumes that the circuit board 
has a characteristic impedance of 100Ω. The 
advantage of terminating the output signals in this fashion 
is that it limits signal swing and reduces AC power 
consumption. 



Characteristic Curves 

Figure 7 shows the typical supply current 
requirements over the operating temperature range of the 
processor at supply voltage (VCC) of 5V. Figure 8 
shows the typical power supply current (ICC) 
required by the 80960MC at various operating 
frequencies when measured at three input voltage 
(VCC) levels. 

Figure 9 shows the typical capacitive derating curve 
for the 80960MC measured from 1.5V on the system 
clock (CLK) to 0.8V on the falling edge and 2.0V on 
the rising edge of the L-Bus address/data (LAD) 
signals. 



Low Drive Network (80960MC open-drain output, 180Ω and 390Ω resistors): 
• VOH = 3.42V 
• IOL = 25.3 mA 

High Drive Network (80960MC open-drain output, 130Ω and 280Ω resistors): 
• VOH = 3.41V 
• IOL = 33.8 mA 

Figure 6. Connection Recommendations for Low and High Current Drive Networks 
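
The VOH values quoted for the two networks in Figure 6 are consistent with a plain resistive divider from a 5V supply. The sketch below reproduces them and also reports the Thevenin resistance the line sees. It assumes that the smaller resistor of each pair is the pullup and the larger one the pulldown (the figure residue does not say which is which), and the function names are illustrative only.

```python
def divider_voh(vcc: float, r_pullup: float, r_pulldown: float) -> float:
    """Output-high voltage of a pullup/pulldown divider with the driver off."""
    return vcc * r_pulldown / (r_pullup + r_pulldown)


def thevenin_ohms(r_pullup: float, r_pulldown: float) -> float:
    """Equivalent termination resistance (the two resistors in parallel)."""
    return r_pullup * r_pulldown / (r_pullup + r_pulldown)


# (pullup, pulldown) in ohms; the assignment is an assumption, see lead-in.
NETWORKS = {"low drive": (180.0, 390.0), "high drive": (130.0, 280.0)}

if __name__ == "__main__":
    for name, (r_up, r_down) in NETWORKS.items():
        voh = divider_voh(5.0, r_up, r_down)
        rth = thevenin_ohms(r_up, r_down)
        print(f"{name}: VOH = {voh:.2f} V, Thevenin R = {rth:.1f} ohm")
```

Both networks bias the line just above the 3.4V level mentioned in the text, which is how the resistor assignment above was inferred.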
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Figure 7. Typical Supply Current (Ice) 

Test Load Circuit 

Figure 10 illustrates the load circuit used to test the 
80960MC's tristate pins, and Figure 11 shows the 
load circuit used to test the open-drain outputs. The 
open-drain test uses an active load circuit in the form 
of a matched diode bridge. Since the open-drain 
outputs sink current, only the IOL legs of the bridge are 
necessary and the IOH legs are not used. When the 
80960MC driver under test is turned off, the output 
pin is pulled up to VREF (i.e., VOH). Diode D1 is 
turned off and the IOL current source flows through 
diode D2. 

When the 80960MC open-drain driver under test is 
on, diode D1 is also on, and the voltage on the pin 
being tested drops to VOL. Diode D2 turns off and 
IOL flows through diode D1. 



Figure 10. Test Load Circuit for TRI-STATE Output Pins 
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Figure 8. Typical Current vs Frequency 
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Figure 9. Capacltive Derating Curve 
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Figure 11. Test Load Circuit for Open-Drain Output Pins 
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ABSOLUTE MAXIMUM RATINGS" 

Case Temperature under Bias(6) ........ -55°C to +125°C 

Storage Temperature ................... -65°C to +150°C 

Voltage on Any Pin .................... -0.5V to VCC + 0.5V 

Power Dissipation ..................... 2.6W (25 MHz) 



NOTICE: This data sheet contains information on 
products in the sampling and initial production phases 
of development. The specifications are subject to 
change without notice. Verify with your local Intel 
Sales office that you have the latest data sheet 
before finalizing a design. 



* WARNING: Stressing the device beyond the "Absolute 
Maximum Ratings" may cause permanent damage. 
These are stress ratings only. Operation beyond the 
"Operating Conditions" is not recommended and 
extended exposure beyond the "Operating Conditions" 
may affect device reliability. 



D.C. CHARACTERISTICS 

80960MC: TCASE(6) = -55°C to +125°C, VCC = 5V ±5% 

Symbol  Parameter                        Min       Max        Units  Test Conditions 
VIL     Input Low Voltage                -0.3      +0.8       V 
VIH     Input High Voltage               2.0       VCC + 0.3  V 
VCL     CLK2 Input Low Voltage           -0.3      +0.8       V 
VCH     CLK2 Input High Voltage          0.55 VCC  VCC + 0.3  V 
VOL     Output Low Voltage                         0.45       V      (1,5) 
VOH     Output High Voltage              2.4                  V      (2,4) 
ICC     Power Supply Current: 
          16 MHz                                   375        mA 
          20 MHz                                   420        mA 
          25 MHz                                   480        mA 
ILI     Input Leakage Current                      ±15        µA     0 ≤ V ≤ VCC 
ILO     Output Leakage Current                     ±15        µA     0.45 ≤ V ≤ VCC 
CIN     Input Capacitance                          10         pF     fC = 1 MHz(3) 
CO      I/O or Output Capacitance                  12         pF     fC = 1 MHz(3) 
CCLK    Clock Capacitance                          10         pF     fC = 1 MHz(3) 
θJA     Thermal Resistance (Junction-to-Ambient) 
          Pin Grid Array                           21         °C/W 
          Ceramic Quad Flatpack                    29         °C/W 
θJC     Thermal Resistance (Junction-to-Case) 
          Pin Grid Array                           4          °C/W 
          Ceramic Quad Flatpack                    8          °C/W 

NOTES: 

1. For three-state outputs, this parameter is measured at: 
   Address/Data ... 4.0 mA 
   Controls ....... 5.0 mA 

2. This parameter is measured at: 
   Address/Data ... -1.0 mA 
   Controls ....... -0.9 mA 
   ALE ............ -5.0 mA 

3. Input, output, and clock capacitance are not tested. 

4. Not measured on open-drain outputs. 

5. For open-drain outputs: 25 mA 

6. Case temperatures are "instant on". 
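
The supply-current and thermal-resistance rows above can be combined into a rough worst-case check. This sketch multiplies the maximum ICC by the highest supply voltage allowed by the 5V ±5% condition, then adds the junction-to-case rise to the case temperature. The formula Tj = Tc + P × θJC is the standard thermal-resistance relation rather than something the data sheet spells out, and all names are illustrative.

```python
VCC_MAX_V = 5.0 * 1.05                           # upper end of 5V +/- 5%
ICC_MAX_A = {16: 0.375, 20: 0.420, 25: 0.480}    # from the ICC rows, by MHz
THETA_JC_C_PER_W = {"PGA": 4.0, "CQP": 8.0}      # junction-to-case rows


def supply_power_w(freq_mhz: int) -> float:
    """Worst-case power drawn from the supply at the given frequency."""
    return VCC_MAX_V * ICC_MAX_A[freq_mhz]


def junction_temp_c(case_temp_c: float, power_w: float, package: str) -> float:
    """Estimated junction temperature for a measured case temperature."""
    return case_temp_c + power_w * THETA_JC_C_PER_W[package]


if __name__ == "__main__":
    power = supply_power_w(25)
    print(f"25 MHz worst-case supply power: {power:.2f} W")
    for package in THETA_JC_C_PER_W:
        tj = junction_temp_c(125.0, power, package)
        print(f"{package}: junction at 125 C case = {tj:.1f} C")
```

The 25 MHz result stays below the 2.6W figure listed under Absolute Maximum Ratings, which suggests the two numbers were derived on a similar basis.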
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AC SPECIFICATIONS 

This section describes the AC specifications for the 
80960MC pins. All input and output timings are 
specified relative to the 1.5V level of the rising edge 
of CLK2, and refer to the time at which the signal 
reaches (for output delay and input setup) or leaves 
(for hold time) the TTL levels of LOW (0.8V) or HIGH 
(2.0V). All AC testing should be done with input 
voltages of 0.4V and 2.4V, except for the clock (CLK2), 
which should be tested with input voltages of 0.45V 
and 0.55 VCC. 



Signal groups shown against the CLK2 edge: 

OUTPUTS: LAD31-LAD0, ADS, W/R, DEN, BE3-BE0, HLDA/HOLDR, CACHE, LOCK, INTA, ALE, DT/R 

INPUTS: LAD31-LAD0, BADAC, IAC/INT0, INT1, INT2/INTR, INT3, HOLD/HLDAR, LOCK, READY 
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NOTE 1: 

For Tri-State pins, T6 and T9 are measured at 1.5V. 
For Open-Drain pins, T6 is measured at 1.5V, T9 at 0.8V. 



Figure 12. Drive Levels and Timing Relationships for 80960MC Signals 
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Figure 13. Timing Relationship of L-Bus Signals 
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A.C. Specification Tables 

80960MC A.C. Characteristics (16 MHz) 
TCASE(3) = -55°C to +125°C, VCC = 5V ±5% 

Symbol  Parameter                         Min    Max  Units  Test Conditions 
T1      Processor Clock Period (CLK2)     31.25  125  ns     VIN = 1.5V 
T2      Processor Clock Low Time (CLK2)   8           ns     VIL = 10% Point = 1.2V 
T3      Processor Clock High Time (CLK2)  8           ns     VIH = 90% Point = 0.1V + 0.5 VCC 
T4      Processor Clock Fall Time (CLK2)         10   ns     VIN = 90% Point to 10% Point 
T5      Processor Clock Rise Time (CLK2)         10   ns     VIN = 10% Point to 90% Point 
T6      Output Valid Delay                2      25   ns     CL = 100 pF (LAD), CL = 75 pF (Controls) 
T6H     HOLDA Output Valid Delay          4      31   ns     CL = 75 pF 
T7      ALE Width                         15          ns     CL = 75 pF 
T8      ALE Invalid Delay                        20   ns     CL = 75 pF(2) 
T9      Output Float Delay                2      20   ns     CL = 100 pF (LAD), CL = 75 pF (Controls)(2) 
T9H     HOLDA Output Float Delay          4      20   ns     CL = 75 pF 
T10     Input Setup 1                     3           ns     (Note 1) 
T11     Input Hold                        5           ns     (Note 1) 
T11H    HOLD Input Hold                   4           ns 
T12     Input Setup 2                     8           ns 
T13     Setup to ALE Inactive             10          ns     CL = 100 pF (LAD), CL = 75 pF (Controls) 
T14     Hold after ALE Inactive           8           ns     CL = 100 pF (LAD), CL = 75 pF (Controls) 
T15     Reset Hold                        3           ns 
T16     Reset Setup                       5           ns 
T17     Reset Width                       1281        ns     41 CLK2 Periods Minimum 

NOTES: 

1. IAC/INT0, INT1, INT2/INTR, INT3 can be asynchronous. 

2. A float condition occurs when the maximum output current becomes less than ILO. Float delay is not tested, but should be 
no longer than the valid delay. 

3. Case temperatures are "instant on". 
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A.C. Specification Tables (Continued) 

80960MC A.C. Characteristics (20 MHz) 
TCASE(3) = -55°C to +125°C, VCC = 5V ±5% 

Symbol  Parameter                         Min    Max  Units  Test Conditions 
T1      Processor Clock Period (CLK2)     25     125  ns     VIN = 1.5V 
T2      Processor Clock Low Time (CLK2)   6           ns     VIL = 10% Point = 1.2V 
T3      Processor Clock High Time (CLK2)  6           ns     VIH = 90% Point = 0.1V + 0.5 VCC 
T4      Processor Clock Fall Time (CLK2)         10   ns     VIN = 90% Point to 10% Point 
T5      Processor Clock Rise Time (CLK2)         10   ns     VIN = 10% Point to 90% Point 
T6      Output Valid Delay                2      20   ns     CL = 60 pF (LAD), CL = 50 pF (Controls) 
T6H     HOLDA Output Valid Delay          4      26   ns     CL = 50 pF 
T7      ALE Width                         12          ns     CL = 50 pF 
T8      ALE Invalid Delay                        20   ns     CL = 50 pF(2) 
T9      Output Float Delay                2      20   ns     CL = 60 pF (LAD), CL = 50 pF (Controls)(2) 
T9H     HOLDA Output Float Delay          4      20   ns     CL = 50 pF 
T10     Input Setup 1                     3           ns     (Note 1) 
T11     Input Hold                        5           ns     (Note 1) 
T11H    HOLD Input Hold                   4           ns 
T12     Input Setup 2                     7           ns 
T13     Setup to ALE Inactive             10          ns     CL = 60 pF (LAD), CL = 50 pF (Controls) 
T14     Hold after ALE Inactive           8           ns     CL = 60 pF (LAD), CL = 50 pF (Controls) 
T15     Reset Hold                        3           ns 
T16     Reset Setup                       5           ns 
T17     Reset Width                       1025        ns     41 CLK2 Periods Minimum 

NOTES: 

1. IAC/INT0, INT1, INT2/INTR, INT3 can be asynchronous. 

2. A float condition occurs when the maximum output current becomes less than ILO. Float delay is not tested, but should be 
no longer than the valid delay. 

3. Case temperatures are "instant on". 






A.C. Specification Tables (Continued) 

80960MC A.C. Characteristics (25 MHz) 
TCASE(3) = -55°C to +125°C, VCC = 5V ±5% 

Symbol  Parameter                         Min    Max  Units  Test Conditions 
T1      Processor Clock Period (CLK2)     20     125  ns     VIN = 1.5V 
T2      Processor Clock Low Time (CLK2)   5           ns     VIL = 10% Point = 1.2V 
T3      Processor Clock High Time (CLK2)  5           ns     VIH = 90% Point = 0.1V + 0.5 VCC 
T4      Processor Clock Fall Time (CLK2)         10   ns     VIN = 90% Point to 10% Point 
T5      Processor Clock Rise Time (CLK2)         10   ns     VIN = 10% Point to 90% Point 
T6      Output Valid Delay                2      19   ns     CL = 60 pF (LAD), CL = 50 pF (Controls) 
T6H     HOLDA Output Valid Delay          4      24   ns     CL = 50 pF 
T7      ALE Width                         12          ns     CL = 50 pF 
T8      ALE Invalid Delay                        20   ns     CL = 50 pF(2) 
T9      Output Float Delay                2      19   ns     CL = 60 pF (LAD), CL = 50 pF (Controls)(2) 
T9H     HOLDA Output Float Delay          4      20   ns     CL = 50 pF 
T10     Input Setup 1                     3           ns     (Note 1) 
T11     Input Hold                        5           ns     (Note 1) 
T11H    HOLD Input Hold                   4           ns 
T12     Input Setup 2                     7           ns 
T13     Setup to ALE Inactive             8           ns     CL = 60 pF (LAD), CL = 50 pF (Controls) 
T14     Hold after ALE Inactive           8           ns     CL = 60 pF (LAD), CL = 50 pF (Controls) 
T15     Reset Hold                        3           ns 
T16     Reset Setup                       5           ns 
T17     Reset Width                       820         ns     41 CLK2 Periods Minimum 

NOTES: 

1. IAC/INT0, INT1, INT2/INTR, INT3 can be asynchronous. 

2. A float condition occurs when the maximum output current becomes less than ILO. Float delay is not tested, but should be 
no longer than the valid delay. 

3. Case temperatures are "instant on". 
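
The Reset Width (T17) entries in the three A.C. tables follow directly from the "41 CLK2 Periods Minimum" condition and the minimum CLK2 period (T1) of each speed grade. A minimal sketch of that arithmetic, with illustrative names, is:

```python
RESET_CLK2_PERIODS = 41                           # from the T17 test condition
T1_MIN_NS = {16: 31.25, 20: 25.0, 25: 20.0}       # minimum CLK2 period by MHz


def min_reset_width_ns(clk2_period_ns: float) -> float:
    """Shortest RESET assertion, in ns, for a given CLK2 period."""
    return RESET_CLK2_PERIODS * clk2_period_ns


if __name__ == "__main__":
    for mhz, period in T1_MIN_NS.items():
        print(f"{mhz} MHz: {min_reset_width_ns(period):.2f} ns")
```

The 16 MHz product is 1281.25 ns, which the table lists as 1281 ns; the 20 MHz and 25 MHz products match the table exactly. A system running CLK2 slower than the minimum period needs a proportionally longer RESET pulse.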
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HIGH LEVEL (MIN) 0.55V CC 



1.5V 

LOW LEVEL (MAX) 0.8V 

90% point 

10% point 



Figure 14. Processor Clock Pulse (CLK2) 
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Figure 15. RESET Signal Timing 
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Figure 16. Hold Timing 



Design Considerations 

Input hold times can be disregarded by the designer 
whenever the input is removed because a subsequent 
output from the processor is deasserted (e.g., 
DEN becomes deasserted). 

In other words, whenever the processor generates 
an output that indicates a transition into a subsequent 
state, the processor must have sampled any 
inputs for the previous state. 

Similarly, whenever the processor generates an output 
that indicates a transition into a subsequent 
state, any outputs that are specified to be three-stated 
in this new state are guaranteed to be three-stated. 



Designing for the ICE-960MC 

The 80960MC In-Circuit Emulator assists in debugging 
80960MC hardware and software designs. The 
product consists of a probe module, cable, and 
control unit. Because of the high operating frequency of 
80960MC systems, the probe module connects 
directly to the 80960MC socket. 

When designing an 80960MC hardware system that 
uses the ICE-960MC to debug the system, several 
electrical and mechanical characteristics should be 
considered. These considerations include capacitive 
loading, drive requirement, power requirement, and 
physical layout. 

The ICE-960MC probe module increases the load 
capacitance of each line by up to 25 pF. It also adds 
one standard Schottky TTL load on the CLK2 line, 
up to one advanced low-power Schottky TTL load 
for each control signal line, and one advanced 
low-power Schottky TTL load for each address/data and 
byte enable line. These loads originate from the 
probe module and are driven by the 80960MC 
processor. 

To achieve high noise immunity, the ICE-960MC 
probe is powered by the user's system. The 
high-speed probe circuitry draws up to 1.1A plus the 
maximum current (ICC) of the 80960MC processor. 

The mechanical considerations are shown in Figure 
17, which illustrates the lateral clearance 
requirements for the ICE-960MC probe as viewed from 
above the socket of the 80960MC processor. 
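
When sizing a target system for use with the ICE-960MC, the electrical figures above reduce to two additions: the probe's 1.1A on top of the processor's maximum ICC, and up to 25 pF on top of each line's existing load. The sketch below uses the maximum ICC values from the D.C. Characteristics table; the function names and the 35 pF board load in the example are hypothetical.

```python
PROBE_CURRENT_A = 1.1                            # probe circuitry, worst case
PROBE_CAPACITANCE_PF = 25.0                      # added per line, worst case
ICC_MAX_A = {16: 0.375, 20: 0.420, 25: 0.480}    # D.C. table maximum ICC by MHz


def supply_current_with_probe_a(freq_mhz: int) -> float:
    """Current the user's supply must provide with the probe installed."""
    return PROBE_CURRENT_A + ICC_MAX_A[freq_mhz]


def line_load_with_probe_pf(board_load_pf: float) -> float:
    """Worst-case capacitance on one line once the probe is attached."""
    return board_load_pf + PROBE_CAPACITANCE_PF


if __name__ == "__main__":
    for mhz in ICC_MAX_A:
        print(f"{mhz} MHz: {supply_current_with_probe_a(mhz):.3f} A")
    # Hypothetical 35 pF board load on a LAD line.
    print(line_load_with_probe_pf(35.0), "pF")
```

In the example, a 35 pF board load reaches the 60 pF at which the 20 MHz and 25 MHz LAD timings are specified, so a design that already uses the full specified load would exceed it with the probe attached.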
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VERTICAL 
CLEARANCE 1.2" 



VIEW FROM 

ABOVE USER CPU 

SOCKET 



MINIMUM CABLE 
BEND RADIUS: 
LESS THAN 3.0" 





Figure 17. ICE-960MC Lateral Clearance Requirements 



MECHANICAL DATA 



Pin Assignment 

The 80960MC is packaged in a 132-lead ceramic pin 
grid array and a 164-lead ceramic quad flatpack. The 
80960MC pin grid array pinout as viewed from the 
substrate side of the component is shown in Figure 
18 and from the pin side in Figure 19. The 80960MC 
ceramic quad flatpack pinout as viewed from the top 
of the package is shown in Figure 20. 

VCC and GND connections must be made to multiple 
VCC and GND pins. Each VCC and GND pin must 
be connected to the appropriate voltage or ground 
and externally strapped close to the package. 
Preferably, the circuit board should include power and 
ground planes for power distribution. Tables 5, 6, 7 
and 8 list the function of each pin. 

NOTE: 

Pins identified as N.C., "No Connect," should never 
be connected under any circumstances. 



Package Dimensions and Mounting 

Pins in the pin grid array package are arranged 
0.100 inch (2.54mm) center-to-center, in a 14 by 14 
matrix, three rows around. (See Figure 21.) 

A wide variety of available sockets allow low-insertion 
or zero-insertion force mountings, and a choice 
of terminals such as soldertail, surface mount, or 
wire wrap. Several applicable sockets are shown in 
Figure 22. 



Package Thermal Specification 

The 80960MC is specified for operation when its 
case temperature is within the range of -55°C to 
+125°C. The PGA case temperature should be 
measured at the center of the top surface opposite 
the pins as shown in Figure 23. The ceramic quad 
flatpack case temperature should be measured at 
the center of the lid on the top surface of the 
package. 



WAVEFORMS 

Figures 24 through 30 show the waveforms for vari- 
ous transactions on the 80960MC's local bus. 
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Figure 18. MG80960MC Pinout— View from Top (Pins Facing Down) 
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Figure 19. MG80960MC Pinout— View from Bottom (Pins Facing Up) 
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(Staggered pin arrangement is shown for clarity only. Actual package has pins of equal length.) 



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Figure 20. MQ80960MC Pinout— View from Top of Package 






Table 5. MG80960MC (PGA) Pinout— In Pin Order 


Pin   Signal       Pin   Signal       Pin   Signal       Pin   Signal 
A1    VCC          C6    LAD20        H1    W/R          M10   VSS 
A2    VSS          C7    LAD13        H2    BE0          M11   VCC 
A3    LAD19        C8    LAD8         H3    LOCK         M12   N.C. 
A4    LAD17        C9    LAD3         H12   N.C.         M13   N.C. 
A5    LAD16        C10   VCC          H13   N.C.         M14   N.C. 
A6    LAD14        C11   VSS          H14   N.C.         N1    VSS 
A7    LAD11        C12   INT3/INTA    J1    DT/R         N2    N.C. 
A8    LAD9         C13   INT1         J2    BE2          N3    N.C. 
A9    LAD7         C14   IAC/INT0     J3    VSS          N4    N.C. 
A10   LAD5         D1    ALE          J12   N.C.         N5    N.C. 
A11   LAD4         D2    ADS          J13   N.C.         N6    N.C. 
A12   LAD1         D3    HLDA/HOLDR   J14   N.C.         N7    N.C. 
A13   INT2/INTR    D12   VCC          K1    BE3          N8    N.C. 
A14   VCC          D13   N.C.         K2    FAILURE      N9    N.C. 
B1    LAD23        D14   N.C.         K3    VSS          N10   N.C. 
B2    LAD24        E1    LAD28        K12   VCC          N11   N.C. 
B3    LAD22        E2    LAD26        K13   N.C.         N12   N.C. 
B4    LAD21        E3    LAD27        K14   N.C.         N13   N.C. 
B5    LAD18        E12   N.C.         L1    DEN          N14   N.C. 
B6    LAD15        E13   VSS          L2    N.C.         P1    VCC 
B7    LAD12        E14   N.C.         L3    VCC          P2    N.C. 
B8    LAD10        F1    LAD29        L12   VSS          P3    N.C. 
B9    LAD6         F2    LAD31        L13   N.C.         P4    N.C. 
B10   LAD2         F3    CACHE        L14   N.C.         P5    N.C. 
B11   CLK2         F12   N.C.         M1    N.C.         P6    N.C. 
B12   LAD0         F13   N.C.         M2    VCC          P7    N.C. 
B13   RESET        F14   N.C.         M3    VSS          P8    N.C. 
B14   VSS          G1    LAD30        M4    VSS          P9    N.C. 
C1    HOLD/HLDAR   G2    READY        M5    VCC          P10   N.C. 
C2    LAD25        G3    BE1          M6    N.C.         P11   N.C. 
C3    BADAC        G12   N.C.         M7    N.C.         P12   N.C. 
C4    VCC          G13   N.C.         M8    N.C.         P13   VSS 
C5    VSS          G14   N.C.         M9    N.C.         P14   VCC 


NOTE: 

Pins identified as N.C. ("No Connect") should never be connected under any circumstances. 
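
The PGA description ("a 14 by 14 matrix, three rows around") and the supply-pin counts given under Power and Grounding (12 VCC, 13 VSS) can be cross-checked against Table 5 with a few lines of code. This is an illustrative sketch only: the row lettering A to P without I and O is inferred from the pin names that appear in the table, and the helper name is hypothetical.

```python
ROWS = "ABCDEFGHJKLMNP"  # 14 row letters seen in Table 5 (no I or O)


def pga_positions(size: int = 14, rings: int = 3) -> list[str]:
    """Pin names of a size x size grid populated `rings` rows deep."""
    names = []
    for r, letter in enumerate(ROWS[:size]):
        for c in range(size):
            edge_distance = min(r, c, size - 1 - r, size - 1 - c)
            if edge_distance < rings:
                names.append(f"{letter}{c + 1}")
    return names


# Supply pins as listed in the PGA pinout tables.
VCC_PINS = "A1 A14 C4 C10 D12 K12 L3 M2 M5 M11 P1 P14".split()
VSS_PINS = "A2 B14 C5 C11 E13 J3 K3 L12 M3 M4 M10 N1 P13".split()

if __name__ == "__main__":
    positions = set(pga_positions())
    print(len(positions), "pin positions")
    print(len(VCC_PINS), "VCC pins,", len(VSS_PINS), "VSS pins")
    print("all supply pins on the grid:", set(VCC_PINS + VSS_PINS) <= positions)
```

The grid yields 132 positions, matching the 132-lead package, and every supply pin falls on a populated position.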
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Table 6. MG80960MC (PGA) Pinout— In Signal Order 


Signal       Pin   Signal       Pin   Signal       Pin   Signal       Pin 
ADS          D2    LAD15        B6    N.C.         J14   N.C.         P9 
ALE          D1    LAD16        A5    N.C.         K13   N.C.         P10 
BADAC        C3    LAD17        A4    N.C.         K14   N.C.         P11 
BE0          H2    LAD18        B5    N.C.         L13   N.C.         P12 
BE1          G3    LAD19        A3    N.C.         L14   N.C.         L2 
BE2          J2    LAD20        C6    N.C.         M1    READY        G2 
BE3          K1    LAD21        B4    N.C.         M6    RESET        B13 
CACHE        F3    LAD22        B3    N.C.         M7    VCC          A1 
CLK2         B11   LAD23        B1    N.C.         M8    VCC          A14 
DEN          L1    LAD24        B2    N.C.         M9    VCC          C4 
DT/R         J1    LAD25        C2    N.C.         M12   VCC          C10 
FAILURE      K2    LAD26        E2    N.C.         M13   VCC          D12 
HLDA/HOLDR   D3    LAD27        E3    N.C.         M14   VCC          K12 
HOLD/HLDAR   C1    LAD28        E1    N.C.         N2    VCC          L3 
IAC/INT0     C14   LAD29        F1    N.C.         N3    VCC          M2 
INT1         C13   LAD30        G1    N.C.         N4    VCC          M5 
INT2/INTR    A13   LAD31        F2    N.C.         N5    VCC          M11 
INT3/INTA    C12   LOCK         H3    N.C.         N6    VCC          P1 
LAD0         B12   N.C.         D13   N.C.         N7    VCC          P14 
LAD1         A12   N.C.         D14   N.C.         N8    VSS          A2 
LAD2         B10   N.C.         E12   N.C.         N9    VSS          B14 
LAD3         C9    N.C.         E14   N.C.         N10   VSS          C5 
LAD4         A11   N.C.         F12   N.C.         N11   VSS          C11 
LAD5         A10   N.C.         F13   N.C.         N12   VSS          E13 
LAD6         B9    N.C.         F14   N.C.         N13   VSS          J3 
LAD7         A9    N.C.         G12   N.C.         N14   VSS          K3 
LAD8         C8    N.C.         G13   N.C.         P2    VSS          L12 
LAD9         A8    N.C.         G14   N.C.         P3    VSS          M3 
LAD10        B8    N.C.         H12   N.C.         P4    VSS          M4 
LAD11        A7    N.C.         H13   N.C.         P5    VSS          M10 
LAD12        B7    N.C.         H14   N.C.         P6    VSS          N1 
LAD13        C7    N.C.         J12   N.C.         P7    VSS          P13 
LAD14        A6    N.C.         J13   N.C.         P8    W/R          H1 


NOTE: 

Pins identified as N.C. ("No Connect") should never be connected under any circumstances. 
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Table 7. MQ80960MC (CQP) Pinout— In Pin Order 


Pin   Signal       Pin   Signal       Pin   Signal       Pin   Signal 
1     BE0          42    LAD11        83    N.C.         124   N.C. 
2     BE3          43    LAD12        84    VCC          125   VSS 
3     READY        44    LAD9         85    N.C.         126   VCC 
4     BE1          45    LAD10        86    N.C.         127   N.C. 
5     CACHE        46    LAD7         87    VSS          128   N.C. 
6     DT/R         47    LAD8         88    N.C.         129   N.C. 
7     LAD31        48    LAD5         89    N.C.         130   N.C. 
8     W/R          49    LAD6         90    N.C.         131   N.C. 
9     LAD29        50    LAD4         91    N.C.         132   N.C. 
10    LAD30        51    LAD1         92    N.C.         133   N.C. 
11    LAD27        52    CLK2         93    N.C.         134   N.C. 
12    LAD28        53    INT2         94    N.C.         135   N.C. 
13    ALE          54    LAD3         95    N.C.         136   N.C. 
14    LAD26        55    LAD2         96    N.C.         137   N.C. 
15    ADS          56    LAD0         97    N.C.         138   N.C. 
16    HLDA         57    RESET        98    N.C.         139   N.C. 
17    N.C.         58    INT3         99    N.C.         140   N.C. 
18    VSS          59    INT1         100   VCC          141   N.C. 
19    VCC          60    VSS          101   N.C.         142   N.C. 
20    VSS          61    VCC          102   N.C.         143   N.C. 
21    VCC          62    VSS          103   VSS          144   N.C. 
22    VCC          63    VCC          104   N.C.         145   N.C. 
23    VSS          64    VSS          105   N.C.         146   N.C. 
24    VCC          65    VCC          106   N.C.         147   N.C. 
25    VSS          66    VSS          107   N.C.         148   N.C. 
26    VCC          67    VCC          108   N.C.         149   N.C. 
27    HOLD         68    N.C.         109   N.C.         150   N.C. 
28    BADAC        69    N.C.         110   N.C.         151   N.C. 
29    LAD25        70    N.C.         111   N.C.         152   N.C. 
30    LAD24        71    N.C.         112   N.C.         153   VSS 
31    LAD23        72    N.C.         113   N.C.         154   VCC 
32    LAD21        73    N.C.         114   N.C.         155   N.C. 
33    LAD22        74    N.C.         115   N.C.         156   N.C. 
34    LAD19        75    INT0         116   N.C.         157   N.C. 
35    LAD20        76    N.C.         117   N.C.         158   VSS 
36    LAD17        77    N.C.         118   N.C.         159   N.C. 
37    LAD18        78    N.C.         119   VSS          160   LOCK 
38    LAD16        79    N.C.         120   VCC          161   FAIL 
39    LAD15        80    N.C.         121   N.C.         162   DEN 
40    LAD14        81    N.C.         122   N.C.         163   BE2 
41    LAD13        82    N.C.         123   N.C.         164   VSS 



NOTE: 

Pins identified as N.C. ("No Connect") should never be connected under any circumstances. 
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Table 8. MQ80960MC (CQP) Pinout—ln Signal Order 


Signal       Pin   Signal       Pin   Signal       Pin   Signal       Pin 
ADS          15    LAD23        31    N.C.         102   N.C.         148 
ALE          13    LAD24        30    N.C.         104   N.C.         149 
BADAC        28    LAD25        29    N.C.         105   N.C.         150 
BE0          1     LAD26        14    N.C.         106   N.C.         151 
BE1          4     LAD27        11    N.C.         107   N.C.         152 
BE2          163   LAD28        12    N.C.         108   N.C.         155 
BE3          2     LAD29        9     N.C.         109   N.C.         156 
CACHE        5     LAD30        10    N.C.         110   N.C.         157 
CLK2         52    LAD31        7     N.C.         111   N.C.         159 
DEN          162   LOCK         160   N.C.         112   READY        3 
DT/R         6     N.C.         17    N.C.         113   RESET        57 
FAILURE      161   N.C.         68    N.C.         114   VCC          19 
HLDA/HOLDR   16    N.C.         69    N.C.         115   VCC          21 
HOLD/HLDAR   27    N.C.         70    N.C.         116   VCC          22 
IAC/INT0     75    N.C.         71    N.C.         117   VCC          24 
INT1         59    N.C.         72    N.C.         118   VCC          26 
INT2/INTR    53    N.C.         73    N.C.         121   VCC          61 
INT3/INTA    58    N.C.         74    N.C.         122   VCC          63 
LAD0         56    N.C.         76    N.C.         123   VCC          65 
LAD1         51    N.C.         77    N.C.         124   VCC          67 
LAD2         55    N.C.         78    N.C.         127   VCC          84 
LAD3         54    N.C.         79    N.C.         128   VCC          100 
LAD4         50    N.C.         80    N.C.         129   VCC          120 
LAD5         48    N.C.         81    N.C.         130   VCC          126 
LAD6         49    N.C.         82    N.C.         131   VCC          154 
LAD7         46    N.C.         83    N.C.         132   VSS          18 
LAD8         47    N.C.         85    N.C.         133   VSS          20 
LAD9         44    N.C.         86    N.C.         134   VSS          23 
LAD10        45    N.C.         88    N.C.         135   VSS          25 
LAD11        42    N.C.         89    N.C.         136   VSS          60 
LAD12        43    N.C.         90    N.C.         137   VSS          62 
LAD13        41    N.C.         91    N.C.         138   VSS          64 
LAD14        40    N.C.         92    N.C.         139   VSS          66 
LAD15        39    N.C.         93    N.C.         140   VSS          87 
LAD16        38    N.C.         94    N.C.         141   VSS          103 
LAD17        36    N.C.         95    N.C.         142   VSS          119 
LAD18        37    N.C.         96    N.C.         143   VSS          125 
LAD19        34    N.C.         97    N.C.         144   VSS          153 
LAD20        35    N.C.         98    N.C.         145   VSS          158 
LAD21        32    N.C.         99    N.C.         146   VSS          164 
LAD22        33    N.C.         101   N.C.         147   W/R          8 


NOTE: 

Pins identified as N.C. ("No Connect") should never be connected under any circumstances. 
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Figure 21. A 132-Lead Pin-Grid Array (PGA) Used to Package the MG80960MC 
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• Low insertion force (LIF) soldertail 
55274-1 

• Amp tests indicate 50% reduction in 
insertion force compared to 
machined sockets 

Other socket options 

• Zero insertion force (ZIF) soldertail 
55583-1 

• Zero insertion force (ZIF) Burn-in 
version 55573-2 

Amp Incorporated
(Harrisburg, PA 17105 U.S.A.
Phone 717-564-0100)



AMP LIF SOCKET 
55274-1 




AMP ZIF SOCKET 
55583-1 



271080-13 



Cam handle locks in low profile position when MG80960MC is installed 
(handle UP for open and DOWN for closed positions). 

Courtesy Amp Incorporated 



Peel-A-Way* Mylar and Kapton 
Socket Terminal Carriers 

• Low insertion force surface 
mount CS132-37TG 

• Low insertion force soldertail 
CS132-01TG 

• Low insertion force wire-wrap 
CS132-02TG (two-level) 
CS132-03TG (three-level)

• Low insertion force press-fit 
CS132-05TG 

Advanced Interconnections
(5 Division Street
Warwick, RI 02818 U.S.A.
Phone 401-885-0485)



Peel-A-Way Carrier No. 132: 
Kapton Carrier is KS132 
Mylar Carrier is MS132 

Molded Plastic Body KS132 
is shown below: 



FOOT PRINT NO. 132 




14 x 14 x 3 ROWS 



271080-14 



SOLDER TAIL -01 LOW PROFILE -04 




WIRE WRAP -02/-03 SOLDER TAIL -33 SURFACE MOUNTING -37 





PRESS FIT -05 




Courtesy Advanced Interconnections 

(Peel-A-Way Terminal Carriers 

U.S. Patent No. 4442938) 



* Peel-A-Way is a trademark of Advanced Interconnections. 



Figure 22. Several Socket Options for Mounting the MG80960MC
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_ MEASURE CASE TEMPERATURE 
AT CENTER OF TOP SURFACE 




132-PIN PGA

Figure 23. Measuring MG80960MC PGA Case Temperature (Tc)

Figure 24. System and Processor Clock Relationship
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Figure 25. Read Transaction 
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Figure 26. Write Transaction with One Wait State 
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Figure 27. Burst Read Transaction 
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Figure 28. Burst Write Transaction with One Wait State 
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Figure 29. Interrupt Acknowledge Transaction 
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Figure 30. Bus Exchange Transaction (PBM = Primary Bus Master, SBM = Secondary Bus Master) 

Revision History 

1. 20 MHz timing specifications were added. 

2. Pin 158, ceramic quad pack (see Figure 20), changed from NC (No Connect) to Vss.
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M82965 
FAULT TOLERANT BUS EXTENSION UNIT 

Military 



Multiprocessor Support 

— Connect up to 32 Processor and 
Memory Modules in a Single System 

Multiple Bus Support with No External 
Logic 

— Connect up to Four 32-Bit Buses for 
High-Bandwidth Access to 
Interleaved Memory 

Software-Transparent Fault Tolerance 

— Recover from a Single-Point Failure 
in a Module or Bus without Affecting 
Program Execution 

Cache Control Support 

— Provides Directory, Coherency 
Logic, and Control Signals for a 
Two-Way Set-Associative Cache 

— Single BXU Supports 16 Kbytes 

— Combine up to Four BXUs to 
Support 64 Kbytes 



Message Passing 

— Supports Interagent Communication

— Redundant Error Reporting Network 

Two I/O Prefetch Channels 

— Provides High-Bandwidth, Low 
Latency Access to Memory or I/O 
for Sequential Transfers 

Memory Module Support 

— Interfaces Discrete Memory 
Controller and DRAM Array to AP- 
Bus 

Advanced CHMOS III Technology 

Advanced Package Technology 
— 132 Lead Ceramic Pin Grid Array
— 164 Lead Ceramic Quad Flatpack

Military Temperature Range:
-55°C to +125°C (Tc)



The M82965 Bus Extension Unit (BXU) is the key to building multiprocessor and fault-tolerant systems with the 
80960MC 32-bit microprocessor. BXUs connect to each other in an expandable matrix that can support up to 
32 processor and memory modules in a single, high-performance system. No external interface logic is re- 
quired. The BXU increases overall system performance by providing hardware support for local caches, I/O 
prefetch, message passing, and multiprocessor arbitration. Through redundant modules, fault-tolerant systems 
based on the BXU can sustain a single-point failure and then reconfigure themselves automatically, while 
application programs continue undisrupted. Truly a VLSI building block, the M82965 BXU supports a wide 
range of fault tolerance and performance options to meet a diverse set of cost, performance, and reliability 
needs. 
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Figure 1. M82965 Block Diagram 






January 1990 
Order Number: 271082-003 



ADVANCE INFORMATION



FUNCTIONAL OVERVIEW 

The M82965 Bus Extension Unit (BXU) is the key 
component in building multiprocessor and fault-toler- 
ant system designs with the 80960MC 32-bit micro- 
processor. Its primary function is to connect the Lo- 
cal bus (L-Bus) of a system module to a system-wide 
bus called the Advanced Processor Bus (AP-Bus), 
allowing the system to expand incrementally as 
each new module or AP-Bus is added. 

Several important features are provided within the 
BXU which streamline 80960MC multiprocessor sys- 
tem operation. To increase the available system bus 
bandwidth, multiple BXUs can be employed within 
each system module to support up to four AP-Buses. 
To reduce AP-Bus traffic, BXU components can di- 
rectly support a two-way set-associative cache. I/O 
prefetch channels are incorporated within each BXU 
to reduce the time necessary to transfer large blocks 
of data from shared system memory or I/O. BXUs 
support processor-to-processor communication by 
recognizing, storing, and exchanging Interagent 
Communication (IAC) messages with other BXUs 
along the AP-Bus. Requests for access to the AP- 
Bus are resolved through BXU arbitration logic 
which ensures that no system modules will suffer 
from resource starvation. 



BXUs support fault tolerant system operation
through several mechanisms used to detect, isolate
and recover from hardware errors. Paired BXUs
monitor each other's operation on a cycle-by-cycle
basis through a method called Functional Redundancy
Checking (FRC). Errors on the AP-Bus are
detected through interlaced parity bits on the
address/data and control lines, signal duplication on
the transaction control lines, and a bus timer used to
monitor the bus for non-response to a request.
Recovery mechanisms include the capability to marry
FRC modules in a primary-shadow pair (Quad Modular
Redundancy), so that if either fails, the surviving
spouse can take over operations immediately.
Transient errors on the AP-Bus are automatically retried,
and in the case of permanent errors, the failed bus is
disabled and all memory accesses switched to a
backup bus.



MULTIPROCESSOR SUPPORT

A multiprocessor 80960MC system is composed of
a set of modules connected to an AP-Bus. Figure 2
shows the three possible types of modules: active,
passive, and the combination of both an active and
passive module. Active modules contain up to two
80960MC processors, cache or private memory, and
a BXU. Passive modules contain a memory array
and controller and a BXU. Active/Passive modules
contain either processors and global memory, or
master and slave I/O devices.
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Figure 2. Types of Modules 
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Local Bus 

In a multiprocessor system each module has its own 
Local Bus (L-Bus), which is typically confined to a 
single board. The L-Bus is provided to interconnect 
components within a module. It is a 32-bit multi- 
plexed, synchronous bus with a maximum bandwidth 
of 43 Mbytes per second at 16 MHz. It has been 
designed to interface with standard support compo- 
nents using minimal glue logic. The L-Bus uses 
HOLD/HOLDA for arbitration with bus slaves and
LOCK for signaling indivisible operations. A READY 
signal can be used to lengthen bus transactions. 



Local Bus protocol permits both primary and sec- 
ondary bus masters to coexist on the bus (often a 
processor and a DMA, or occasionally two proces- 
sors). A secondary bus master must obtain use of 
the L-Bus from the bus master through the use of 
HOLDR/HOLDAR. A BXU is always used as a mas- 
ter in a memory module and is generally used as a 
slave in a processor module. Fifty BXU pins are ded- 
icated to L-Bus and module support operations (in- 
cluding cache control). The L-Bus control registers 
are shown in Table 1.



Table 1. L-Bus Control Registers 



Register                Description

Physical-ID (Local)     This register contains a unique identifier for a specific BXU on
                        the L-Bus. It corresponds to the AP-Bus Physical-ID register.

Logical-ID (Local)      This register holds the Logical-ID of the BXU. It corresponds to
                        the AP-Bus Logical-ID register.

LBI Control             This is the major control register for BXU functions on the
                        L-Bus. It is used to set the interleaving factor for the cache,
                        determines if the BXU should act as a master on the L-Bus, and
                        indicates whether the BXU is in memory or processor mode.

System Bus ID           This register uniquely identifies the BXU as attached to one of
                        four AP-Buses.

Local-Bus Test          This register allows system diagnostics to check on the type of
                        recognition that was done on the previous L-Bus request.

Match 0                 The contents of this register determine which bits in the L-Bus
                        address should be recognized by the BXU. This register provides
                        a base address for a partition of memory recognized by the BXU.

Mask 0                  The contents of this register determine if certain bits in the
                        Match register should be ignored (i.e., marked "don't care")
                        during address recognition.

Match 1                 Same function as Match Register 0.

Mask 1                  Same function as Mask Register 0.

Match 2                 Same function as Match Register 0.

Mask 2                  Same function as Mask Register 0.

Private Memory Match    Private memory address recognizer.

Private Memory Mask     Private memory mask register.
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Advanced Processor Bus 

A highly optimized multiprocessing bus called the 
Advanced Processor Bus (AP-Bus) interconnects 
80960MC system modules. The AP-Bus is synchro- 
nous, in that all components in the system, including 
processors and BXUs, are driven by the same clock 
edge. It is a 32-bit multiplexed bus with a maximum 
bandwidth of 43 Mbytes per second at 16 MHz. 

Transactions over the AP-Bus are encoded into 
pairs of request and reply packets. A request packet 
defines the operation, amount of data, and the loca- 
tion (or address) where the transaction will occur. In 
the case of a write request, the packet will also in- 
clude data. The reply packet indicates whether or 
not the action completed successfully, and in the 
case of read replies, will also include the requested 
data. Table 2 lists the various types of AP-Bus oper- 
ations. 

The AP-Bus supports a pipelining feature that allows 
up to three requests to be pending at any time. Re- 
ply packets are returned in the order requested un- 
less deferred, but requests and replies may be inter- 
mixed. For example, two requests may be made, fol- 
lowed by a single reply packet, then another request 
packet, before being completed by two reply pack- 
ets. 
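
The ordering rule can be illustrated with a toy model. This is only a sketch: the class and method names are hypothetical, deferred replies (RPYDEF) are not modeled, and the only facts taken from the text are the three-request limit and in-order replies. It replays the example above (two requests, one reply, another request, two replies).

```python
from collections import deque

MAX_PENDING = 3  # from the text: up to three requests may be pending


class ApBusPipeline:
    """Toy model of AP-Bus request/reply ordering (hypothetical names)."""

    def __init__(self):
        self.pending = deque()  # requests awaiting a reply, oldest first

    def request(self, tag):
        # A fourth outstanding request is not allowed by the pipeline depth.
        if len(self.pending) >= MAX_PENDING:
            raise RuntimeError("pipeline full: three requests already pending")
        self.pending.append(tag)

    def reply(self):
        # Replies are returned in the order requested (deferral not modeled).
        return self.pending.popleft()


bus = ApBusPipeline()
trace = []
bus.request("A")
bus.request("B")
trace.append(bus.reply())   # reply to A
bus.request("C")
trace.append(bus.reply())   # reply to B
trace.append(bus.reply())   # reply to C
print(trace)
```

A real BXU arbitrates for the bus before each request; the model ignores arbitration and only tracks which replies are outstanding.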

The AP-Bus consists of 47 bi-directional signals, a 
clock signal, a RESET signal, and five module sup- 
port signals which are used to interface system mod- 
ules to the AP-Bus (see Figure 3). The BXU is the 
only component that attaches to the AP-Bus. 



BXUs connect to each other in the form of a matrix 
to allow orderly growth in the system by the addition 
of buses or modules. An 80960MC multiprocessing 
system allows up to 32 modules and four AP-Buses. 
In practice, the number of modules in a system will 
be somewhat less in order to meet the AP-Bus's 
timing and electrical specifications; a practical limit 
may be 20 to 25 connections to an AP-Bus. Table 3 
contains a summary of the functions of the AP-Bus 
Interface Registers. 

Table 2. Types of AP-Bus Operations 



Packet Type    Base Action    Specific Operation

Request        Write          Write Word(s)
                              RMW Write Word(s)

               Read           Read Word(s)
                              RMW Read Word(s)

Reply          Accepted       Read Reply Word(s)
                              Acknowledge (Write Reply)

               Refused        Reissue
                              Not Acknowledged (NACK)
                              Bad Access
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Transaction Control 

• Arbitration: ARB (3..0) 

• Reply Ordering: RPYDEF 

Packet Signals 

• Specification: SPEC (5..0) 

• Address/Data: AD (31.. 0) 

Error Signal Group 

• Check Signal: CHK(1..0) 

• Bus Error: (1..0) 






TRANSACTION CONTROL (5 LINES) 



PACKET SIGNALS (38 LINES) 



ERROR SIGNAL GROUP (4 LINES) 



SYNCHRONIZATION (2 LINES) 



MODULE SUPPORT (7 LINES) 






271082-3 



Synchronization and Initialization Group 

• System Clock: CLK2 

• Initialization: RESET 

Module Support Group 

• Identification: INITID 

• Module Check: MODCHK 

• Bus Output Control: BOUT 

• Communication: COM 

• Voltage Reference: Vref 

• Pop Queue: POPQUE 

• Subsystem Busy: SSBUSY 



Figure 3. Advanced Processor Bus 
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Table 3. AP-Bus Interface Registers 


Register                Description

Physical ID             This register contains a unique identifier for a specific BXU
                        (or FRC pair of BXUs) on an AP-Bus.

Logical ID              This register holds the logical ID for the BXU. In every case,
                        all BXUs in the same module will share the same logical ID.
                        When two modules are married in a QMR configuration, they will
                        also share the same logical ID.

Component Specifier     The contents of this read-only register are fixed at
                        manufacture and specify the type and stepping of the component.

Arbitration ID          When the BXU needs to issue a request on the AP-Bus, it must
                        actively arbitrate for the bus. The time and order in which a
                        BXU arbitrates is determined by the contents of this write-only
                        register.

Com                     This register is used for loading external information, such as
                        the type of board the BXU resides on, into the BXU. The
                        register is useful for both initialization and diagnostics.

AP-Bus Control          This register is the general control and status register for
                        the BXU's AP-Bus interface.

FT1                     Most of the BXU fault-tolerant capabilities can be selectively
                        enabled by altering control bits in this register.

Maxtime                 The value in this register determines the length of time that
                        BXUs will remain quiescent following the beginning of an error
                        report.

FRC Splitting Control   Writing to this register allows a master/checker pair of BXUs
                        to be split into separately functioning components.

FRC Register            The contents of this register determine if a BXU is part of a
                        master/checker pair and how the component responds if it is
                        part of a QMR module.

Test Detection          Bits in this register enable parity logic and other internal
                        self-testing diagnostic features.

AP Match                Bits in this register are compared against the corresponding
                        bits in the AP-Bus address cycle and determine which partition
                        of the address space is recognized by this BXU.

AP Mask                 If a bit in this register is cleared, it will cause the
                        corresponding bit position in the Address Match register to be
                        ignored during comparisons.
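
The AP Match and AP Mask descriptions amount to a masked comparison. The following is a minimal sketch, assuming 32-bit register values and the polarity stated for AP Mask (a cleared mask bit means "ignore this bit"); the function name and the example register contents are hypothetical.

```python
def recognizes(address, match, mask):
    """Return True if `address` lies in the partition selected by match/mask.

    Only bit positions whose mask bit is set take part in the comparison;
    cleared mask bits are "don't care" positions.
    """
    return (address & mask) == (match & mask)


# Hypothetical setup: recognize the 256-Mbyte partition starting at 0x40000000
# by comparing only the top four address bits.
AP_MATCH = 0x40000000
AP_MASK = 0xF0000000

print(recognizes(0x41234560, AP_MATCH, AP_MASK))  # True: top nibble is 0x4
print(recognizes(0x50000000, AP_MATCH, AP_MASK))  # False: top nibble is 0x5
```

Clearing every mask bit makes every address match, which is why a recognizer that should be inactive needs some other disable mechanism; the text does not say what that is.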




Memory addressing over the AP-Bus is divided into 
16-byte blocks. The location of a bus transaction is 
defined by a 32-bit address. Each address points to 
a single byte that is part of a larger 16-byte block. All 
transactions are performed on a single block or por- 
tion of a block, and do not overlap multiple blocks. 
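
Because a transaction may not overlap two 16-byte blocks, a requester has to split any access that crosses a block boundary. A minimal sketch of that split follows; the function name is hypothetical and the routine illustrates the rule only, not any device's actual logic.

```python
BLOCK = 16  # AP-Bus memory addressing is divided into 16-byte blocks


def split_at_block_boundary(address, length):
    """Split (address, length) into pieces that each stay inside one block."""
    pieces = []
    while length > 0:
        room = BLOCK - (address % BLOCK)   # bytes left in the current block
        n = min(length, room)
        pieces.append((address, n))
        address += n
        length -= n
    return pieces


# An 8-byte access starting 4 bytes before a boundary becomes two requests.
print(split_at_block_boundary(0x1C, 8))   # [(28, 4), (32, 4)]
# An aligned 16-byte access fits in a single block.
print(split_at_block_boundary(0x10, 16))  # [(16, 16)]
```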



Modes of Operation 

The BXU operates in either Processor or Memory 
mode. Processor mode provides support for Active 
or Active/Passive modules, while Memory mode
supports Passive modules. The functions of several 
BXU signals are dependent on the operating mode 
of the BXU. 



In Processor mode, the BXU supports cache, I/O 
prefetch and IAC message functions. The BXU can 
act as either a master or slave on the L-Bus and 
requests can flow in either direction between the 
AP-Bus and the L-Bus. The assumption is, however, 
that most traffic will flow from the L-Bus out onto the 
AP-Bus. In a processor-only module, there is no 
need for the BXU to participate in arbitration for the 
L-Bus, since it will operate only as a slave. 

In Memory mode, the BXU always operates as a 
master on the L-Bus and no requests are ever ac- 
cepted from the L-Bus. All requests flow from the 
AP-Bus into the module. In this mode, the BXU sup- 
ports memory functions and signaling, but does not 
provide caching or I/O prefetch. 
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Read-Modify-Write Transactions 

Read-Modify-Write (RMW) operations are provided 
to give BXUs the ability to read and modify a location 
as a single indivisible action. A RMW-Read opera- 
tion initiates the indivisible action by asserting the 
LOCK signal on the L-bus. A RMW-Write operation 
is used to terminate the action.

When an RMW-Read transaction occurs, the block 
of memory addressed is marked by the BXU control- 
ling that portion of memory as locked (the lock cov- 
ers a fixed address space based on address bits 4 
and 6). Once locked, any other RMW-Reads to this 
block will be rejected, but the block remains avail- 
able for other types of memory operations. 

When an RMW-Read is issued, the BXU controlling 
the affected memory will either respond with data in 
a normal Read Reply (and set the appropriate lock), 
or it will respond with a Reissue Reply indicating that 
the requested block is already locked. If refused, the 
requesting BXU will wait a short interval and then put 
the RMW-Read request back into the arbitration pro- 
cess and try again. 

An RMW-Write is equivalent to Write Word(s) except
that it resets the lock for that memory location. The
only valid reply packet is the Ack (Write Reply).
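
The lock handshake described above can be sketched as a small model. Several details are assumptions made for illustration: the lock number is formed from address bits 4 and 6 with bit 4 as the low-order bit, lock timeouts are not modeled, and the class and method names are hypothetical.

```python
def lock_index(address):
    """Lock number from address bits 4 and 6 (bit 4 as low bit is assumed)."""
    return ((address >> 4) & 1) | (((address >> 6) & 1) << 1)


class MemoryBxuLocks:
    """Toy model of RMW locking in the BXU that controls a memory partition."""

    def __init__(self):
        self.locked = set()

    def rmw_read(self, address):
        idx = lock_index(address)
        if idx in self.locked:
            return "Reissue"        # already locked: requester retries later
        self.locked.add(idx)
        return "Read Reply"         # data returned and lock set

    def rmw_write(self, address):
        self.locked.discard(lock_index(address))
        return "Ack"                # the only valid reply to an RMW-Write


bxu = MemoryBxuLocks()
print(bxu.rmw_read(0x100))   # Read Reply (lock 0 taken)
print(bxu.rmw_read(0x108))   # Reissue    (same lock, still held)
print(bxu.rmw_write(0x100))  # Ack        (lock 0 released)
print(bxu.rmw_read(0x108))   # Read Reply
```

Ordinary reads and writes are not shown because the text states that a locked block remains available to them.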



Interagent Communications (IAC)
Support

Bus Extension Units and 80960MC processors com- 
municate by sending Interagent Communication 
(IAC) messages, which are a set of memory-mapped 
addresses recognized by all BXUs. These messages 
are used for such system functions as initialization, 
cache flushing, access to error logs and interrupts. 
The upper 16 Mbytes of the 80960MC's 4 Gigabyte 
address range are reserved for IAC communica- 
tions. 
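
As a worked example of the address arithmetic, the reserved region can be computed directly from the two figures given above (16 Mbytes at the top of a 4-Gigabyte space). The constant and function names are hypothetical; the layout of individual IAC addresses inside the region is not covered here.

```python
ADDRESS_SPACE = 2**32              # 4 Gigabytes
IAC_SPACE_SIZE = 16 * 2**20        # upper 16 Mbytes reserved for IAC
IAC_BASE = ADDRESS_SPACE - IAC_SPACE_SIZE


def is_iac_address(address):
    """True if a 32-bit address falls in the region reserved for IAC."""
    return IAC_BASE <= address < ADDRESS_SPACE


print(hex(IAC_BASE))                 # 0xff000000
print(is_iac_address(0xFF000010))    # True
print(is_iac_address(0xFEFFFFFF))    # False
```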



IAC requests fall into two major groups: messages 
and register requests. Messages are sent between 
processors to cause a processor to perform a spe- 
cific action (e.g., Start, stop, flush cache, etc.) and 
are held in the IAC message support registers; Table 
4 summarizes the function of these four registers. 
Register requests are used by software to read and 
write to BXU registers in order to control the system 
operation or configuration. 

An IAC message always originates on an L-Bus and 
usually from a processor. From the originator, the 
request flows to the BXU where it may be handled 
internally or propagated on to the AP-Bus. If the IAC 
is sent on to the AP-Bus, the final destination of the 
IAC (another BXU) must reside on that bus. The IAC 
will not be propagated onto another L-Bus or AP- 
Bus. IAC messages can be one to four words long. 

Although each L-Bus (processor or memory module) 
may be connected to as many as four AP-Buses, at 
any point in time only one bus will be designated as 
the message bus. All IAC messages will flow over 
that bus. The BXUs on the message bus are respon- 
sible for handling the IAC message traffic on behalf 
of the processors residing on their L-Bus (an L-Bus 
may support one or two processors). 

AP-Bus 0 normally serves as the message bus. If
AP-Bus 0 is not functional, then AP-Bus 1 serves as
the message bus, completely transparent to the
software. Processors are unaware of which bus is
actually acting as the message bus.



Table 4. IAC Support Registers

Register                Description

Processor 0 Priority    This register holds the priority of the task (process) which
                        Processor 0 on the BXU's L-Bus is currently executing.

Processor 0 Message     This register buffers four words of data from an IAC message
                        for Processor 0.

Processor 1 Priority    This register holds the priority of the task (process) which
                        Processor 1 on the BXU's L-Bus is currently executing.

Processor 1 Message     This register buffers four words of data from an IAC message
                        for Processor 1.



I/O Prefetch Support

The BXU offers two I/O prefetch channels to provide
high bandwidth, low latency access to memory
for sequential transfers. Each channel buffers 32
bytes of data in two 16-byte blocks. As data is
requested from the buffers, the BXU automatically
prefetches the next data block. The BXU can take
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advantage of the three-deep AP-Bus pipeline to 
quickly fill the buffers if it ever gets behind because 
of momentary surges in AP-Bus traffic. In this way, 
the prefetch logic acts to provide stable, bounded 
response times, even in large multiprocessor config- 
urations. 

Because the normal operation of the BXU hides the 
latency of write requests by replying immediately on 
the L-Bus, the prefetch unit operates only for read 
requests. On a read request from the L-Bus, the pre- 
fetch logic returns the amount of data requested. 
Any processor or intelligent device used with the 
BXU must guarantee that it will split all memory re- 
quests that cross 16-byte boundaries into two re- 
quests. 



Cache Support 

The main function of a cache is to provide local high 
speed storage for frequently accessed memory lo- 
cations. Storing the information locally, the cache 
intercepts memory references and handles them di- 
rectly without transferring the request to the AP-Bus. 
This action results in lower traffic on the AP-Bus and 
decreased latency on the L-Bus, leading to im- 



proved performance for a processor on the L-Bus. It 
also increases potential system performance in a 
multiprocessor system by reducing each processor's 
demand for AP-Bus bandwidth, thereby allowing 
more processors in a system. 

The BXU provides cache directory, coherency logic, 
and control signals, while external SRAM is used for 
data storage. A CACHE signal output from the 
80960MC processor indicates to the BXU whether a 
request is cacheable. The operation of the BXU 
cache is not dependent on the size of the data trans- 
fer and therefore can support partial writes. Both 
data and instructions can be contained within the 
local cache. 

The BXU supports a two-way, set associative cache 
with 64 sets. The (read address) tag field is 20 bits 
long and consists of LAD lines 31-12. There are 
eight bits that indicate if a line is valid (a line is 16 
bytes). The control bits in the cache control registers 
can be used to mask some of these bits to change 
cache configurations. All entries in the directory can 
be invalidated by sending an INVALIDATE CACHE 
Command to each BXU in the module. Figure 4 
shows one example of a BXU cache directory and 
its relation to L-Bus addresses. 




AP-BUS ADDRESS fields: LAD 31-LAD 12, LAD 12-LAD 7, LAD 6-LAD 4, LAD 3-LAD 0
Directory contents: STORED ADDRESSES (two columns)
Output: CACHE ADDRESS



Figure 4. Example of a Cache Directory Array 
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A single BXU supports 16 Kbytes of cache. When a 
processor module uses multiple BXUs (and there- 
fore multiple buses), the BXUs cooperate to provide 
a larger directory and addressing for a larger cache. 
The best way to view this larger directory is to think 
of it as having an increased number of sets. Thus a 
cache managed by two BXUs will have a directory 
consisting of 128 sets instead of 64. The maximum 
size cache is 64 Kbytes (four BXUs supporting four 
AP-Buses per processor module). 
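
The quoted cache sizes follow from the directory geometry. The sketch below is a worked check, assuming that each directory entry covers eight 16-byte lines (one valid bit per line) and that each added BXU contributes another 64 sets; the two-BXU figure of 128 sets is stated above, while 256 sets for four BXUs is inferred. The names are hypothetical.

```python
WAYS = 2               # two-way set-associative
SETS_PER_BXU = 64      # sets in a single BXU's directory
LINES_PER_ENTRY = 8    # eight valid bits per entry, one per line (assumed)
LINE_BYTES = 16        # a line is 16 bytes


def cache_bytes(num_bxus):
    """Data cache size supported by `num_bxus` cooperating BXUs."""
    sets = SETS_PER_BXU * num_bxus
    return WAYS * sets * LINES_PER_ENTRY * LINE_BYTES


for n in (1, 2, 4):
    print(n, "BXU(s):", cache_bytes(n) // 1024, "Kbytes")
# 1 BXU(s): 16 Kbytes
# 2 BXU(s): 32 Kbytes
# 4 BXU(s): 64 Kbytes
```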

The cache is managed using a write-through policy
that guarantees that the shared system memory will 
always have the most recent copy of all data; BXU 
caches never contain the only copy of revised data. 
Any time a processor updates a cache entry, it al- 
ways causes a write request on the AP-Bus, so that 
there are never any hidden updates. In addition, all 
BXUs monitor AP-Bus traffic to detect if an update is 
being made to a location which they are storing in 



their own cache. If so, that line in the cache directory 
is marked invalid. This procedure guarantees that a 
BXU cache will always return correct data even 
when a system uses multiple caches, when multiple 
processors treat a single data item differently (some 
caching, some not), or when two processors are 
used on a single L-Bus. 

An example of an SRAM control design using a single
BXU is shown in Figure 5. The BXU supplies six
memory control signals to interface the directory and
control logic with an external cache composed of
static RAM: Cache Read (CR), Cache Write (CW),
Way0 (WY0), Way1 (WY1), Word0 (WD0), and
Word1 (WD1). SRAM control also requires use of
the L-Bus byte enable (BE3-BE0) signals and certain
address lines. To simplify latching the byte enable
signals, the BXU asserts READY on all address
and recovery cycles as well as when it is transferring
data.
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Figure 5. Sample Cache SRAM Control Design Using a BXU 
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The tight timing specifications of SRAMs require a 
small amount of external logic to interface a static 
RAM cache to a BXU. Since all BXU cache signals 
have a relatively wide clock to data valid specifica- 
tion (T CC j), external flip-flops are used to achieve 
tighter resolution of the Cache Write and Word edges.
The address bits are latched using ALE from the
processor. Way0 selects between the two "ways" in
the cache directory, and Way1 selects between the
cache and private memory (if present on the L-Bus).

In order to ensure that the cache is filled properly, 
the byte enable latch is cleared on read requests. If 



the processor made a read request for two bytes 
that missed the cache, the BXU would first write the 
entire 16-byte block, then return the requested infor- 
mation to the processor. If the byte enable latches 
weren't set, then the write into the cache wouldn't 
work correctly because not all byte enables would 
be asserted. Byte enable information does not need 
to be held on reads because data is always returned 
in full words and the processor selects the portion of 
the word that it needs internally. Signal timings are 
shown in Figures 6-10. 
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Figure 6. Cache Read Signal Timing for 35 ns SRAMs 
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Figure 7. Cache Write Signal Timing for 35 ns SRAMs 
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Figure 8. Cache Read Signal Timing for 70 ns SRAMs 
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Figure 9. Cache Write Signal Timing for 70 ns SRAMs 
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The BXU has four memory address recognizers for 
the L-Bus plus an additional recognizer for initializa- 
tion RAM. Three of the memory address recognizers 
(Mask2-0 and Match2-0) map to shared system 
memory, while the fourth address recognizer maps 
requests to SRAM on the local bus, called private 
memory. The INIT-RAM recognizer serves two func- 
tions: it enables bootstrap software to use the 
SRAM cache as a scratch pad during system initiali- 
zation, and it provides the means for executing a 
memory test on the SRAM cache. The private mem- 
ory recognizer allows SRAM to be used on the local 
bus as normal memory in addition to a cache. Private
memory is not accessible by other modules on
the AP-Bus.



Memory Module Support 

When operating in Memory mode, the BXU is a Lo- 
cal Bus master and only handles requests inbound 
from the AP-Bus. The cache control logic is disabled 
since it is unnecessary in a memory module. 

A read request received by an idle BXU will be seen 
on the L-Bus 1.5 clock cycles after it was received 
on the AP-Bus. BXUs offer two reply speed options 
for inbound Read requests. The high-performance 
option, called the "fast reply" mode, allows data to 
flow onto the AP-Bus with only a half-cycle delay 
through the BXU. This option requires the L-Bus 
memory controller to be able to supply data on every 
clock cycle. In the "slow reply" mode, the BXU buff- 
ers the entire AP-Bus reply packet before sending it 
onto the AP-Bus. This option permits the use of 
slower, less costly memory. 

Write requests are fully buffered before being 
passed to the L-Bus. Once the BXU has received an 
error-free packet, it initiates the L-Bus transaction. 
When the last data word has been accepted on the 
L-Bus, the BXU generates a reply on the AP-Bus. 

In Memory mode, the BXU provides two or four
Read-Modify-Write locks with timeouts. Four locks
are available if the module is not interleaved with
other modules, two locks if it is interleaved. When
interleaving occurs, address bit 4 is used as part of
the address recognition for the module, which
restricts the module to either locks 0 and 2 or
locks 1 and 3. This approach ensures that if a bus switch
occurs, the locks that may have been allocated on
the failed bus will not overlap with locks that are
currently allocated on the surviving bus (since all
traffic is rerouted to the surviving bus).



FAULT TOLERANCE 

Three basic tenets form the basis for the implementation
of 80960MC fault tolerant systems. First,
fault tolerant functions are achieved through the
replication of VLSI components. Second, the system is
partitioned into a set of confinement areas which
form the basis of error detection and recovery. Third,
only bus-oriented communication paths are used to
provide system communication.

The BXU is unique in that it provides all the functions 
necessary to detect, isolate, and recover from a fail- 
ure in any single system module or AP-Bus. Unlike 
many other fault tolerant system designs, 80960MC 
systems do not rely on voter components for fault 
detection, thereby eliminating one potential source 
of single-point failures. Although the BXU registers 
must be initialized by software, all the fault tolerant 
mechanisms are built into the hardware, and correct 
fault recovery of a system built using the BXU does 
not depend on software intervention. 

The purpose of a confinement area is to inhibit dam- 
age from error propagation and to isolate the faulty 
area for subsequent recovery and repair. A confine- 
ment area is defined as a unit (system module or 
AP-Bus) that has a limited number of tightly controlled
interfaces. Figure 11 shows the confinement
areas within a small system. Detection mechanisms 
exist at every interface to ensure that no inconsist- 
ent data can leave the confinement area and corrupt 
other confinement areas. When a fault occurs in the 
system, it is immediately isolated to a confinement 
area. The fault is known to be in that confinement 
area, and all other confinement areas are known to 
be fault-free. All intermodule communication in an 
80960MC system occurs over buses. There are no 
point-to-point or daisy-chained signals. 

This arrangement makes modular growth and on- 
line repair possible since no signal definition is de- 
pendent on the number of resources in the system. 
The presence or absence of any module cannot pre- 
vent communication between any other modules. 
The AP-Bus provides a uniform communications 
matrix that allows multiprocessor and fault-tolerant 
systems to expand modularly. 

In 80960MC systems, there are three distinct steps 
in responding to an error. First, the error is detected 
and isolated to a confinement area. Next, the error is 
reported to all the modules in the system. This ac- 
tion prevents the incorrect data from propagating 
into another confinement area and provides all the 
modules with the information required to perform re- 
covery. Finally, the faulty confinement area is isolat- 
ed from the system. Recovery occurs through the 
application of redundant resources available in the 
system. Table 5 describes the fault-tolerant control 
registers. 

Figure 11. Fault Confinement Areas in an 80960MC System

Table 5. Fault Tolerance Support Registers and Commands

Register                  Description
------------------------  ----------------------------------------------------
Test Type                 The Test Report command instructs the BXU to test
                          the error reporting network. The type of error
                          report generated is determined by the contents of
                          this register.

Spouse ID                 In a QMR module, this register holds the module ID
                          of the FRC module to which this module is married.

QMR                       The contents of this register determine whether a
                          module is part of a QMR pair and whether it should
                          function as the primary or the shadow in the pair.

Module Error ID           Identifies the BXU as part of a specific module
                          confinement area.

Bus Error ID              Determines the Bus ID contents in an error report.

Error Log                 Records the type of the most recent error report
                          received and the number of errors that have
                          occurred since the last Terminate Permanent Error
                          Window command.

Error Record              Holds the contents of the previous error report.

FT2                       Holds additional fault-tolerant control parameters.

Test Report Command       Instructs the BXU to test the error reporting
                          network. The type of error report generated is
                          determined by the contents of the Test Type
                          register.

Primary Catastrophe       A write to this register causes a Primary
Command                   Catastrophe error report, usually indicating a
                          primary module power failure.

Shadow Catastrophe        A write to this register causes a Shadow
Command                   Catastrophe error report, usually indicating a
                          shadow module power failure.

Terminate Permanent       A write to this register closes the permanent
Error Window Command      error window, so that a recurrence of a previous
                          error is not recorded as permanent.

Attach Bus Command        A write to this register causes the identified bus
                          to be attached to the system and become active.

Detach Bus Command        A write to this register causes the identified bus
                          to be detached from the system and become inactive.

Sync Refresh Command      A write to this register causes BXUs in memory
                          mode to assert their ForceRef pin and enables
                          AP-Bus address matching.

Functional Redundancy Checking

BXU components can be paired together to com- 
pare their outputs to ensure that they agree. This 
detection mechanism is called Functional Redun- 
dancy Checking (FRC) because identical compo- 
nents are used to check operations. 

At initialization time, one component in the BXU pair 
is selected to be the "Master", while the other is 
designated the "Checker". The Master BXU is re- 
sponsible for carrying out the normal operation of 
the system and behaves as it would if it were operat- 
ing in a non-fault tolerant system. The Checker BXU, 
in contrast, disables its AP-Bus outputs and instead 
monitors the AP-Bus pins of the Master (see Figure 
12). The Checker BXU is responsible for duplicating 
the operation of the Master and using its internal 
comparison circuitry to detect any inconsistency be- 
tween its result and the output of the Master. 

The Master and Checker BXUs run in lock step,
comparing operations cycle-by-cycle. If at any point
the Master and Checker disagree, an FRC error is
signaled and an error reporting cycle begins.
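
As an illustration only, the lock-step comparison can be modeled in a few lines of Python. This is a behavioral sketch, not a description of the BXU's comparison circuitry: the `BusCycleOutput` record and the `frc_first_mismatch` helper are hypothetical names, and the pin values are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BusCycleOutput:
    """Hypothetical record of the FRC-covered pin values for one bus cycle."""
    ad: int
    spec: int
    bout: int
    modchk: int


def frc_first_mismatch(master_outputs, checker_outputs):
    """Return the first cycle index on which the two streams differ.

    Returns None when Master and Checker agree on every cycle.
    """
    for cycle, (m, c) in enumerate(zip(master_outputs, checker_outputs)):
        if m != c:
            return cycle  # an FRC error would be signaled on this cycle
    return None


# Illustrative values only.
good = [BusCycleOutput(0x1000, 0x20, 1, 0), BusCycleOutput(0xCAFE, 0x01, 1, 0)]
faulty = [good[0], BusCycleOutput(0xCAFF, 0x01, 1, 0)]  # one AD bit flipped
```

Comparing `good` against itself yields no mismatch, while comparing `good` against `faulty` flags the second cycle.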

When using the FRC mechanism, the BXU pins 
comprising the electrical connection to the AP-Bus 
must be connected together. A BXU provides FRC 
coverage on the AD, SPEC, BOUT and MODCHK 
pins. 



Failures in the Checker's AP-Bus drivers can be de-
tected by reversing the roles of the Master and
Checker BXU. When Master/Checker Toggling is
enabled, the roles of the Master and Checker are
switched after each bus cycle.



Parity, Duplication and Timeouts 

In order to prevent incorrect AP-Bus operation from
passing corrupted data to the BXU (and onto the
Local Bus), the BXU uses parity, signal duplication, 
and bus timeouts to check for errors. Specifically, 
the AP-Bus has interlaced parity bits covering the 
AD and SPEC signals, signal duplication is used on 
both arbitration and RPYDEF, and a bus timer is set 
to monitor the bus for non-response to a request. 

The BXU calculates two separate parity bits across
alternate AD and SPEC signals, which are indicated
by the CHK0 and CHK1 pins. CHK0 is even parity
across the even AD and SPEC pins, and CHK1 is
even parity across the odd pins. Since the arbitration
and RPYDEF lines are driven independently by mul-
tiple bus agents (BXUs), parity cannot be used for
error detection. Instead, errors are detected
by duplicating each set of lines, one set for Masters,
the other set for Checkers. Consequently, each BXU
connects to only one arbitration network. If there is a
disagreement between the two sets of signals on
the AP-Bus, it will be detected through an FRC dis-
agreement.

Figure 12. Functional Redundancy Checking (FRC)

The BXU uses a timer to determine whether too long a
period has elapsed since a bus request was made
without a response being received. During normal
operation the timer is active whenever
the bus pipeline is not empty. The timer is reset on
every bus reply or deferral. If the BXU was the
source of the request and a timeout occurs, it sig-
nals a Bad Access Reply on the AP-Bus. The timeout
period is nominally 64 clocks.
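
The interlaced parity scheme described above can be sketched as follows. This is a minimal model under stated assumptions: "even" and "odd" pins are taken to mean even and odd signal indices (AD0, AD2, ... and SPEC0, SPEC2, ... feed CHK0), and the parity bit is taken to be the value that makes the total count of ones even. The function names are hypothetical, and the data sheet should be consulted for the actual pin grouping and polarity.

```python
AD_WIDTH = 32    # AD31-AD0
SPEC_WIDTH = 6   # SPEC5-SPEC0


def even_parity_bit(bits):
    """Parity bit that makes the number of ones (data plus parity) even."""
    return sum(bits) & 1


def interlaced_parity(ad, spec):
    """Return (chk0, chk1) for one bus cycle.

    Assumption: chk0 covers the even-indexed AD and SPEC lines and chk1
    covers the odd-indexed ones.
    """
    ad_bits = [(ad >> i) & 1 for i in range(AD_WIDTH)]
    spec_bits = [(spec >> i) & 1 for i in range(SPEC_WIDTH)]
    chk0 = even_parity_bit(ad_bits[0::2] + spec_bits[0::2])
    chk1 = even_parity_bit(ad_bits[1::2] + spec_bits[1::2])
    return chk0, chk1
```

Under this model, a fault that flips two adjacent lines changes both check bits, whereas a single parity bit over all lines would not see it.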



Error Reporting 

The error reporting network is the backbone of fault 
isolation and recovery. When an error is detected, 
the BXU detecting the error reports its type and lo- 
cation to all other nodes in the system. The error 
reporting network is designed so that, independent 
of an error in the system, each node not only re-
ceives an error report, but is guaranteed to receive
the same error report. Each BXU in the system uni- 
formly logs each error report, and is able to use this 
information to proceed independently with the ap- 
propriate recovery procedure. 

The BXU has two serial Error Reporting Lines asso-
ciated with each bus interface (BERLs for the AP-
Bus and LERLs for the Local Bus). An identical se-
rial error report is sent over each pair of lines associ-
ated with each bus.

An AP-Bus error reporting cycle consists of five 
phases: Reporting, Partner Communications, Tran- 
sient Waiting Period, Retry, and the Permanent Error 
Window (see Figure 13). The reporting phase lasts 
256 cycles from the beginning of the first report re- 
ceived on the BXU's error reporting lines. The BXU 
becomes quiescent as soon as it detects the start bit 
of an error report and remains quiescent through the 
Transient Waiting Period. 



Figure 13. Error Reporting Cycle

During partner communications, BXUs communicate
with each other via their POPQUE lines to determine 
whether to retry accesses in the case that one of the 
AP-Buses is removed from the system. Partner or- 
dering lasts 256 cycles. 

Transient waiting enables the system to sustain dis- 
turbances from mechanical vibrations and brief elec- 
trical transients without needing to permanently re- 
configure the system. The BXUs simply wait a pre- 
determined time for the transient to subside. The du- 
ration of the Transient Waiting Period is adjustable 
and can be set by software (16 µs to 500 ms at
16 MHz). During this period, the BXU completes its 
internal recovery mechanisms (if the error is perma- 
nent). Since the transient waiting mechanism on the 
buses depends on all buses moving to the retry 
state at the same time, all BXUs must have identical 
values for the Transient Waiting Period. 



During the RETRY phase, all accesses that were 
pending at the time that the error report was re- 
ceived will be retried. At the same time as RETRY 
begins, the BXU enters the Permanent Error Win- 
dow. During this interval, the BXU watches for the 
error to reoccur. 
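
The phase timing described above can be worked through numerically. The sketch below assumes that the phases follow one another with no overlap, that cycles are counted from the start bit of the first report, and that one "cycle" equals one period of the 16 MHz clock; none of these details are stated explicitly here, and the function names are hypothetical.

```python
REPORTING_CYCLES = 256   # reporting phase length given in the text
PARTNER_CYCLES = 256     # partner phase length given in the text
CLOCK_HZ = 16_000_000    # 16 MHz, the frequency quoted for the waiting range


def microseconds_to_cycles(us, clock_hz=CLOCK_HZ):
    """Convert a duration in microseconds to clock cycles (assumed 1:1 with bus cycles)."""
    return us * clock_hz // 1_000_000


def phase_at(cycle, transient_wait_cycles):
    """Name the phase in progress `cycle` cycles after the report's start bit."""
    if cycle < REPORTING_CYCLES:
        return "reporting"
    cycle -= REPORTING_CYCLES
    if cycle < PARTNER_CYCLES:
        return "partner communications"
    cycle -= PARTNER_CYCLES
    if cycle < transient_wait_cycles:
        return "transient waiting"
    # RETRY and the Permanent Error Window begin at the same time.
    return "retry / permanent error window"


shortest_wait = microseconds_to_cycles(16)        # 16 us
longest_wait = microseconds_to_cycles(500_000)    # 500 ms
```

Under these assumptions the shortest Transient Waiting Period is 256 cycles, the same length as each of the two preceding phases, and the longest is 8,000,000 cycles.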

Each BXU has two registers that are used for log- 
ging error reports. The ERROR LOG register con- 
tains the current error report and the ERROR REC- 
ORD register contains the previous error report. 
When an error report is received, the contents of the
ERROR LOG register are copied into the ERROR 
RECORD register. Both registers are accessible by 
software and are the primary means by which the 
software routines responsible for system manage- 
ment communicate with the hardware fault handling 
mechanisms. Table 6 lists the types of errors that 
can be reported. 



Table 6. Error Types Reported

Error Type                Description
------------------------  ----------------------------------------------------
Unsafe Confinement Area   This type of report is issued when an error is
                          detected that would make a retry dangerous.

Primary Catastrophe       Generated in response to a Primary Catastrophe
                          Command from software. The command is usually
                          issued when all primary modules are about to fail
                          because of a loss of power.

Shadow Catastrophe        Generated in response to a Shadow Catastrophe
                          Command from software. The command is usually
                          issued when all shadow modules are about to fail
                          because of a loss of power.

Error Reporting Error     Indicates that a BXU has detected a failure on one
                          of its error reporting lines.

Bus Arbitration           Issued when an FRC error is detected on the BOUT
                          pin of the BXU, indicating a bus arbitration error.

Bus Parity                Indicates that a parity error has been detected on
                          the AP-Bus.

Component                 Indicates that a checker has detected an FRC error
                          while its master was driving the AP-Bus.

Uncorrectable Array       An uncorrectable error has been detected in one of
Error                     the memory arrays.

Correctable ECC           A correctable error has been detected in one of the
                          memory arrays.

COM Altered               Occurs when the COM input is toggled (two cycles
                          high, followed by two cycles low). May be used by
                          external circuits to notify the system of an
                          external fault.

Attach Bus                Issued in response to an Attach Bus command. Used
                          to reactivate a bus that was previously out of
                          service.

Detach Bus                Issued in response to a Detach Bus command. Used to
                          remove a faulty bus from the system.

Terminate Permanent       Receiving this report signifies the end of the
Error Window              Permanent Error Window.

Sync Refresh              Used to synchronize memory modules that are being
                          married to form a Primary/Shadow Pair.







The BXU's hardware compares the contents of the 
two error reporting registers to determine if a bus 
retry has resulted in a repeat of the previous error 
(which therefore must be considered a permanent 
error). Software can clear the two registers by send- 
ing a Terminate Permanent Error Window command. 
The registers allow software to monitor the health of 
the system and to respond appropriately in case of 
hardware problems. The availability of this informa- 
tion simplifies diagnostic routines. 
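
The register behavior described above (shift the current report into ERROR RECORD, then compare) can be modeled as a small sketch. This is an illustrative software model, not the hardware logic: the class and method names are hypothetical, a report is represented as an arbitrary tuple, and the real comparison may consider only some fields or apply only inside the Permanent Error Window.

```python
class ErrorRegisters:
    """Toy model of the ERROR LOG / ERROR RECORD pair (hypothetical)."""

    def __init__(self):
        self.error_log = None      # current error report
        self.error_record = None   # previous error report
        self.error_count = 0       # errors since the window was last closed

    def receive_report(self, report):
        """Log a report; return True if it repeats the previous report."""
        self.error_record = self.error_log   # LOG is copied into RECORD
        self.error_log = report
        self.error_count += 1
        return self.error_record is not None and self.error_log == self.error_record

    def terminate_permanent_error_window(self):
        """Model of the software command that clears both registers."""
        self.error_log = None
        self.error_record = None
        self.error_count = 0


regs = ErrorRegisters()
report = ("Bus Parity", 3, 0)   # (type, module ID, bus ID): made-up fields
first = regs.receive_report(report)    # first occurrence: treated as transient
second = regs.receive_report(report)   # repeat after retry: treated as permanent
regs.terminate_permanent_error_window()
third = regs.receive_report(report)    # after the window closes: transient again
```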

The ERROR LOG register is handled independently 
by hardware and software; hardware always re- 
sponds immediately to an error report so that it is 
never lost by failure of software to respond. During 
normal system operation, software should never 
write to this register, since it is both read and written 
by hardware. The ERROR LOG register is cleared 
on a cold start, but its contents are retained across a 
warm start. 



RECOVERY MECHANISMS 



Module Shadowing 

Automatic recovery from permanent single-point fail- 
ures in a module is accomplished through module 
shadowing, or what is more formally called Quad 
Modular Redundancy (QMR). Using this technique, 
two FRC pairs (master/checker) of the same type 
are logically linked to form a primary/shadow pair 
(see Figure 14). The marriage of the two modules is 
performed by software which sets the logical ID of 
the two modules equal and restarts them in lock 
step (or synchronous operation). There is no direct 
electrical connection between a primary/shadow 
pair. They are usually on separate boards so that 
either can be removed in the case of a failure in that 
module. 

Figure 14. In Quad Modular Redundancy (QMR), Self-Checking Modules are Paired

The primary/shadow pair operates in lock step so
that there is always a complete and current backup 
for an FRC pair. At any point in time, one FRC pair 
will be active (i.e., sending its output to the AP-Bus) 
while the other will be passive (i.e., its outputs will be 
disabled). Initially, the primary FRC pair is active and 
is responsible for issuing requests or replies to the 
AP-Bus. Data leaves only by means of the active 
FRC pair. 

As an option, the roles of active and passive mod- 
ules are switched after every second bus cycle. (In 
contrast, master/checker pairs are toggled every cy- 
cle). This ping-pong action exercises all of the logic 
in both primary and shadow modules. Any latent fail- 
ure that exists in the AP-Bus drivers will be detected 
immediately. All of the logic to perform this lock step 
operation is contained in the BXU and neither the 
processors nor any discrete logic contained in a 
module is aware that the module is participating as 
one-half of a primary/shadow pair. 
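
To make the two toggling rates concrete, the sketch below lists which BXU would drive the AP-Bus on successive bus cycles when both options are enabled. It assumes that cycle 0 starts with the primary module's Master driving and that the two toggles are independent; the function name and the phase alignment are illustrative assumptions rather than documented behavior.

```python
def driving_bxu(bus_cycle, mc_toggling=True, qmr_toggling=True):
    """Return (module, role) of the BXU assumed to drive the bus on this cycle."""
    module = "primary"
    # Active/passive modules swap after every second bus cycle.
    if qmr_toggling and (bus_cycle // 2) % 2 == 1:
        module = "shadow"
    role = "master"
    # Master/Checker roles swap after every bus cycle.
    if mc_toggling and bus_cycle % 2 == 1:
        role = "checker"
    return module, role


sequence = [driving_bxu(n) for n in range(4)]
```

Under these assumptions every one of the four BXUs in a QMR set drives the bus once in each group of four bus cycles, which is consistent with the statement that the ping-pong action exercises the drivers in both modules.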

Each physical FRC pair (primary and shadow) re-
mains a self-checking pair. Whether in an active or
passive module, all detection mechanisms remain
enabled and continuously check the operation of
that module. Neither the primary nor the shadow
checks the operation of the other; FRC is used for
fault detection, while module shadowing (Quad Mod-
ular Redundancy) is used to ensure immediate re-
covery.



Automatic Module Recovery 

If a permanent error is detected in either a primary or 
a shadow FRC pair, the faulty pair will immediately 
be disabled as all BXUs in the pair shut down. The
surviving spouse then separates itself from the faulty 
FRC pair and operates as an active pair on every 
bus cycle. At that point, recovery is complete. 

Hardware recovery is autonomous and requires no 
software intervention to complete. The operating 
system can be informed that a hardware reconfigu- 
ration has taken place by tying an error report line to 
one of the processor's interrupt pins. Then when a 
fault occurs, a processor can examine the error re-
port log to discover what has happened and then re-
examine the system configuration. Figure 15 shows
an example of module recovery. 

Figure 15. Faulty Modules are Automatically Disabled

Bus Switching

All AP-Buses in an 80960MC system are physically 
identical, but when a system is operational each bus 
handles a unique address range. The BXU has been 
designed so that it is possible to pair together two
AP-Buses and have them act as redundant or alter-
nate resources for each other. AP-Bus 0 is paired
with AP-Bus 1 and AP-Bus 2 is paired with AP-Bus 3.
In order for an FRC pair to have an additional bus, it 
must also have another pair of Master/Checker 
BXUs. Normally the memory addresses will be inter- 
leaved between the two (or four) buses, but this isn't 
necessary for bus switching. 

Since the AP-Bus does not hold state information 
(as do processors and memory), all buses in the sys-
tem may be used during normal operation. There is
no degradation of throughput to achieve bus redun- 
dancy. Each bus is fully operational. 

When a permanent error has been detected on an 
AP-Bus, all BXUs on the faulty bus disable them- 
selves. L-Bus requests for the failed bus will be ig- 
nored by the disabled BXUs and picked up instead 
by the BXUs attached to the backup bus. If a BXU 
has a cache, the BXU invalidates its cache directory 
since the directory must be reorganized to match the 
new (and larger) address space, including a new in- 
terleaving factor. Figure 16 shows an example of 
bus switching. 

Figure 16. If a Bus Fails, Its Backup Bus Takes Over Immediately

Legend: C = CPU, B = BXU, M = Memory Array

Hardware automatically reconfigures to bypass the faulty bus (AP-Bus 0).
AP-Bus 1 takes over the address space of AP-Bus 0.

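
The address redistribution that bus switching implies can be sketched as follows. The interleaving granularity (16 bytes here), the function names, and the rule that a bus's partner is found by flipping the low bit of its number are assumptions made for illustration; only the pairing of bus 0 with bus 1 and bus 2 with bus 3 comes from the text.

```python
INTERLEAVE_BYTES = 16  # assumed interleaving granularity, not from the data sheet


def partner_bus(bus):
    """Bus 0 pairs with 1, bus 2 pairs with 3 (flip the low bit)."""
    return bus ^ 1


def home_bus(address, num_buses):
    """Bus that handles `address` when all buses are healthy (interleaved)."""
    return (address // INTERLEAVE_BYTES) % num_buses


def serving_bus(address, num_buses, failed=frozenset()):
    """Bus that serves `address` given a set of failed buses, or None."""
    bus = home_bus(address, num_buses)
    if bus not in failed:
        return bus
    backup = partner_bus(bus)
    return None if backup in failed else backup
```

For example, with two buses and bus 0 failed, every address is served by bus 1, so the surviving bus covers the larger combined address range mentioned above.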
Self-Healing Systems

In some applications it is important to guarantee the 
integrity of the data, but momentary interruptions in 
processing can occur without seriously affecting op- 
erations or jeopardizing human lives. For these ap- 
plications, a cost effective approach may be to use 
self-healing systems. 

Self-healing systems use Functional Redundancy 
Checking to ensure that all errors are detected and 
that faults are confined within a module. Fault recov-
ery is not automatic; recovery and reconfiguration are
done by software following error detection. Self-
healing systems are less costly than fully fault-toler- 
ant systems because fewer components are neces- 
sary. 

Self-healing systems do not operate continuously in 
the case of a hardware failure. Program execution 
cannot proceed after detection of a permanent error 
until the system has been reconfigured. Transient 
errors will still be taken care of by the hardware 
components. Upon detection of a permanent error,
the system will cease operation; however, FRC en-
sures that no data will have been corrupted.

After the system stops, it must be reset and a diag-
nostic program must be run that reads the BXU error logs
and determines the most appropriate action to take.
Recovery and reconfiguration may be complete and 
the system back on-line within a few seconds to sev- 
eral minutes, depending on the nature of the fault. 

Self-healing systems are not appropriate for real- 
time applications where program delays longer than 
a few milliseconds cannot be tolerated. In these crit- 
ical applications, an interruption in system operation 
might result in damage to expensive material and 
equipment, or endangerment of human lives. The 
80960MC system fault tolerant architecture provides 
the means for building systems that will recover au-
tomatically within 48 µs.



BXU Registers 

Initialization and control of the BXU is done by read- 
ing and writing the BXU's internal registers. The reg- 
isters are mapped to the upper 16 Mbytes of the 
80960MC processor's physical address space. 
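
As a worked example of the mapping above, and assuming a 32-bit physical address space (the L-Bus carries 32-bit physical addresses), the upper 16 Mbytes begin at address FF000000 hexadecimal. The constant and function names below are hypothetical; individual register offsets are not given here and are not modeled.

```python
PHYSICAL_ADDRESS_BITS = 32                 # assumed from the 32-bit L-Bus addresses
REGISTER_SPACE_BYTES = 16 * 1024 * 1024    # upper 16 Mbytes
ADDRESS_SPACE_END = 1 << PHYSICAL_ADDRESS_BITS
REGISTER_SPACE_BASE = ADDRESS_SPACE_END - REGISTER_SPACE_BYTES


def in_bxu_register_space(address):
    """True if a physical address falls in the region mapped to BXU registers."""
    return REGISTER_SPACE_BASE <= address < ADDRESS_SPACE_END
```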

Initialization of a system using BXUs occurs in three
stages. In the first stage, which immediately follows
RESET, all registers (except for the registers con-
taining error report information) are loaded with 0 or
with values sampled off a set of pins.

During this stage the BXU's System Bus ID and 
mode of operation are established. In the second 
stage, software assigns logical, physical, and arbitra- 
tion IDs to each BXU. Then in the third stage, the 
COM pin can be used to load board-specific infor- 
mation into the BXU and software can change the 
default values of any of the registers. 

Once software has established the initial configura- 
tion of the system, no further interaction between 
the system software and the BXU may be necessary 
except for testing the error reporting functions and 
for making on-line changes to the system's initial 
configuration. 

This Advance Information Data Sheet contains a 
functional description for each of the BXU's major 
register groups. For more specific details on control- 
ling each of the registers, please consult the 
80960MC Hardware Designer's Reference Manual. 



SIGNAL DESCRIPTIONS 

Tables 7 through 11 describe the function of each of
the BXU signals. Many of the pins are multiplexed 
and have different interpretations depending on 
whether the BXU is in Processor or Memory mode. 

Table 7. M82965 BXU L-Bus Signals

Symbol       Type       Name and Function
-----------  ---------  ------------------------------------------------------
LAD31-LAD0   I/O, T.S.  LOCAL ADDRESS/DATA BUS: Carries 32-bit physical
                        addresses and data to and from a processor or
                        memory. During an address (Ta) cycle, bits 2-31
                        contain a physical word address (bits 0-1 indicate
                        SIZE; see below). During a data (Td) cycle, bits
                        0-31 contain read or write data. The LAD lines are
                        active HIGH and float to a three-state OFF when the
                        bus is not acquired.

                        SIZE: Comprises bits 0-1 of the LAD bus during a Ta
                        cycle and specifies the size of a transfer in words.

                          LAD1  LAD0
                           0     0    1 Word
                           0     1    2 Words
                           1     0    3 Words
                           1     1    4 Words

ALE          T.S.       ADDRESS-LATCH ENABLE: Indicates the transfer of a
                        physical address. ALE is asserted during a Ta cycle,
                        and deasserted during Td cycles and the second half
                        of Ta cycles. It is active LOW and floats to a
                        three-state OFF when the L-Bus is not acquired.

ADS          I/O, O.D.  ADDRESS STATUS: Is used to detect address cycles and
                        additional data cycles.

CACHE        I          CACHEABLE: During a Ta cycle, specifies whether data
                        is cacheable. When operating in the MEMORY mode this
                        pin should be tied to ground through a 10 kΩ
                        resistor.

W/R          I/O, O.D.  WRITE/READ: Specifies, during a Ta cycle, whether
                        the operation is a write or read. It is latched
                        on-chip and remains valid during Td cycles.

CW/DEN       O.D.       CACHE WRITE: (Defined only when the BXU is in
                        PROCESSOR mode). This signal indicates that the
                        cache SRAM should be written with data from the
                        L-Bus and is used to generate the chip select and
                        write enable signals required by the SRAM. The
                        signal is open drain so it can be shared among
                        multiple BXUs controlling a single set of SRAMs.

                        DATA ENABLE: (Defined only when the BXU is in MEMORY
                        mode). Is asserted during Td cycles and indicates
                        transfer of data on the local AD bus lines.

CR/DT/R      O.D.       CACHE READ: (Defined only when the BXU is in
                        PROCESSOR mode). This signal indicates that the
                        cache SRAM should drive data onto the L-Bus in
                        response to a read request and is used to generate
                        the chip select and output enable signals required
                        by the SRAM. This signal is open drain so it can be
                        shared among multiple BXUs controlling a single
                        cache.

                        DATA TRANSMIT/RECEIVE: (Defined only when the BXU is
                        in MEMORY mode). Indicates the direction of data
                        transfer. It is low during Ta and Td cycles for a
                        read or interrupt acknowledgement; it is high during
                        Ta and Td cycles for a write. DT/R never changes
                        state when DEN is asserted.

LOCK         I          BUS LOCK: Is used by the BXU to distinguish between
                        normal reads and RMW-reads, and between normal
                        writes and RMW-writes.

                        An 80960MC processor asserts LOCK at the beginning
                        of an RMW cycle, and the BXU recognizes it as an
                        RMW-read. If the read operation is accepted by the
                        module serving memory, the processor drops LOCK and
                        executes an RMW-write. LOCK is also held asserted
                        during an interrupt-acknowledge transaction.

READY        I/O, O.D.  READY: Indicates that data on LAD lines can be
                        sampled or removed. If READY is not asserted during
                        a Td cycle, the Td cycle is extended to the next
                        cycle, and ADS is not asserted in the next cycle.
                        READY is driven on Ta, Tr and Ti cycles.

NOTES:
I/O = Input/Output, I = Input, O = Output, O.D. = Open Drain, T.S. = three-state
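
The address-cycle layout of the LAD lines can be illustrated with a short decoding sketch. It assumes the SIZE encoding runs from 1 word (both bits 0) to 4 words (both bits 1), that is, the two-bit code plus one; the function name and the returned dictionary layout are hypothetical.

```python
def decode_address_cycle(lad):
    """Split a 32-bit LAD value sampled during a Ta cycle into its fields."""
    size_code = lad & 0b11                    # bits 0-1: SIZE
    return {
        "word_address": lad >> 2,             # bits 2-31: physical word address
        "byte_address": lad & 0xFFFFFFFC,     # same address expressed in bytes
        "words": size_code + 1,               # assumed: code 0..3 means 1..4 words
    }


example = decode_address_cycle(0x00001003)
```

In this example the request addresses the word at byte address 1000 hexadecimal and transfers four words.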

Table 7. M82965 BXU L-Bus Signals (Continued)

Symbol       Type       Name and Function
-----------  ---------  ------------------------------------------------------
BE3-BE0      I/O, O.D.  BYTE ENABLES: Specify which data bytes on the local
                        bus will take part in the next bus cycle. BE3
                        corresponds to LAD24-LAD31 and BE0 corresponds to
                        LAD0-LAD7.

HOLD/HOLDAR  I          HOLD: Indicates that a master I/O peripheral
                        requests control of the bus. When the BXU receives
                        HOLD and grants the peripheral control of the bus,
                        it floats the bus lines and then asserts HLDA and
                        enters the Th state. When HOLD is deasserted, the
                        BXU will deassert HLDA and go to either the Ti or Ta
                        state.

                        HOLD ACKNOWLEDGE REQUEST: Is an input to the
                        secondary bus master indicating that the primary bus
                        master has relinquished control of the bus.

HLDA/HOLDR              HOLD ACKNOWLEDGE: Relinquishes control of the bus to
                        a master I/O peripheral.

                        HOLD REQUEST: Is used by a Secondary Bus Master to
                        request use of the bus from the Primary Bus Master.

Table 8. M82965 BXU L-Bus Module Support Signals

Symbol       Type       Name and Function
-----------  ---------  ------------------------------------------------------
BADAC        O.D.       BAD ACCESS: If asserted in the cycle following the
                        one in which the last READY of a transaction is
                        asserted as a result of a bad access, it indicates
                        that the transaction has exceeded the AP-Bus
                        time-out period.

IAC0/ERR     I/O, O.D.  INTERAGENT COMMUNICATION, PROCESSOR 0: (Defined only
                        when the BXU is in PROCESSOR mode). Is an open-drain
                        output that indicates that there is a pending IAC
                        message for Processor 0 on the BXU's local bus.

                        EXTERNAL ERROR: (Defined only when the BXU is in
                        MEMORY mode). Is an input that indicates that an
                        error has been detected in external logic (e.g., a
                        failure in a discrete memory controller).

IAC1/FRF     O.D.       INTERAGENT COMMUNICATION, PROCESSOR 1: (Defined only
                        when the BXU is in PROCESSOR mode). Is an open-drain
                        output that indicates that there is a pending IAC
                        message for Processor 1 on the BXU's local bus.

                        FORCE REFRESH: (Defined only when the BXU is in
                        MEMORY mode). Is an open-drain output that tells the
                        external memory controller to immediately execute a
                        refresh operation.

PFETCH       I          PREFETCH: Is used in conjunction with the CACHE and
                        Write/Read (W/R) signals to define the type of
                        request being issued (0 = LO, 1 = HI):

                          PFETCH  CACHE  W/R
                            0       0     0   Read using Prefetch Channel 0
                            0       0     1   Start for Prefetch Channel 0
                            0       1     0   Read using Prefetch Channel 1
                            0       1     1   Start for Prefetch Channel 1
                            1       0     0   Noncacheable Read
                            1       0     1   Noncacheable Write
                            1       1     0   Cacheable Read
                            1       1     1   Cacheable Write

NOTES:
I/O = Input/Output, I = Input, O = Output, O.D. = Open Drain
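
The PFETCH, CACHE and W/R combinations follow a regular pattern that can be expressed as a short decoding sketch. It assumes the eight rows of the request-type listing run in binary order from 0 0 0 to 1 1 1, so that PFETCH low selects a prefetch-channel request (CACHE choosing the channel, W/R choosing start or read) and PFETCH high selects an ordinary access. The function name is hypothetical.

```python
def request_type(pfetch, cache, wr):
    """Name the L-Bus request implied by the three signal levels (0 = LO, 1 = HI)."""
    if pfetch == 0:
        # Assumed: CACHE selects the prefetch channel, W/R selects start vs. read.
        action = "Start for" if wr else "Read using"
        return f"{action} Prefetch Channel {cache}"
    kind = "Cacheable" if cache else "Noncacheable"
    direction = "Write" if wr else "Read"
    return f"{kind} {direction}"
```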

Table 9. M82965 BXU AP-Bus Signals

Symbol       Type       Name and Function
-----------  ---------  ------------------------------------------------------
AD31-AD0     I/O, O.D.  SYSTEM ADDRESS/DATA LINES: Carry 32-bit addresses
                        and data between modules (BXUs) on an AP-Bus. The
                        content of the AD lines is defined by the SPEC
                        encoding during the same bus cycle.

SPEC5-SPEC0  I/O, O.D.  PACKET SPECIFICATION: Define the packet type and the
                        parameters required for the transaction:

                        SPEC5: REQUEST: Is asserted if the packet is a
                        request packet.

                        SPEC4: MULTICYCLE: Is asserted if the packet
                        consists of more than one bus cycle.

                        SPEC3-SPEC2: CYCLE COUNT: These two bits are used in
                        conjunction with the Request and Multicycle signals
                        to specify the length of the packet (in bus cycles)
                        and the data length (in words).

                        SPEC1-SPEC0: OPERATION/STATUS TYPE: These two bits
                        identify the specific operation or status conveyed
                        by the packet.

CHK1-CHK0    I/O, O.D.  CHECK SIGNALS: Provide interlaced parity for the
                        SPEC and AD lines.

ARB3-ARB0    I/O, O.D.  ARBITRATION: Used by the bus agents (BXUs) to
                        determine which agent has access to the bus next.
                        These signals have a timing that is one-half cycle
                        out of phase with the AD lines.

RPYDEF       I/O, O.D.  REPLY DEFER: Allows an agent to give up its "slot"
                        on the bus temporarily if its access is going to
                        take a long time. This action reorders the pipeline,
                        moving the deferred request to the bottom of the
                        queue, resets the bus time-out counter and permits
                        another agent to use the bus.

BERL1-BERL0  I/O, O.D.  BUS ERROR REPORT LINES: Are used to signal errors
                        from bus transactions or from within modules
                        connected to the bus.

NOTES:
I/O = Input/Output, I = Input, O = Output, O.D. = Open Drain
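
The SPEC field layout can be illustrated with a bit-field extraction sketch. It treats an asserted signal as a logic 1 in the corresponding bit, which is an assumption (the electrical polarity is not given here), and it does not interpret the cycle-count or operation codes, whose meanings are not listed in this section. The function name is hypothetical.

```python
def decode_spec(spec):
    """Split a 6-bit SPEC value (SPEC5..SPEC0) into its named fields."""
    return {
        "request": bool((spec >> 5) & 1),      # SPEC5: REQUEST
        "multicycle": bool((spec >> 4) & 1),   # SPEC4: MULTICYCLE
        "cycle_count": (spec >> 2) & 0b11,     # SPEC3-SPEC2: CYCLE COUNT (raw code)
        "op_type": spec & 0b11,                # SPEC1-SPEC0: OPERATION/STATUS TYPE
    }


fields = decode_spec(0b110110)
```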

Table 10. M82965 BXU AP-Bus (Local Agent) Support Signals

Symbol       Type       Name and Function
-----------  ---------  ------------------------------------------------------
CLK2         I          SYSTEM CLOCK: Provides the base timing and
                        synchronization for all agents (BXUs) in the system.
                        It is sourced to all agents from a central clock and
                        is twice the frequency of the bus cycle.

                        NOTE: The clock skew over the AP-Bus for a typical
                        system should be no greater than 6 ns for correct
                        system operation.

BOUT         I/O, O.D.  BUS OUTPUT CONTROL: Is asserted whenever a component
                        is driving the AP-Bus. Functional Redundancy Checks
                        on BOUT can be used to detect arbitration failures.

MODCHK       I/O, O.D.  MODULE CHECK: Is connected between Master/Checker
                        pairs, allowing a Functional Redundancy Check to be
                        performed on internal states.

INITID       I          INITIALIZE ID: Is connected to one of the 32 AD
                        lines and is used in conjunction with the IDENTIFY
                        DEVICE IAC to provide a unique address for each BXU
                        at initialization time.

Vref         I          VOLTAGE REFERENCE: Provides a stable voltage
                        reference for the input buffers of components
                        connected to the AP-Bus. External hardware must
                        provide a Vref voltage (see Table 14) on the Vref
                        pin during normal operation of the component. The
                        Vref pin is also used to distinguish between a warm
                        start (system memory and the Error Record register
                        retain their state) and a cold start (system memory
                        and BXU registers are cleared).

RESET        I          RESET: Forces all agents on the bus to reset and
                        synchronize. The bus cycle begins the first CLK2
                        period after RESET is deasserted. The RESET signal
                        is the way a BXU is synchronized to the rest of the
                        system.

COM          I/O, O.D.  COMMUNICATION: Can be used to load information into
                        a component as part of the initialization sequence
                        or to inform external logic that the component has
                        failed. The BXU will assert COM if it has shut
                        itself off due to a failure in its module.

                        The COM signal is not involved in any aspect of
                        AP-Bus operation, but can be used to load
                        board-dependent information into the BXU or to
                        signal the rest of the system that an external error
                        has occurred.

NOTES:
I/O = Input/Output, I = Input, O = Output, O.D. = Open Drain
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Table 11. M82965 BXU Module Support Signals 



Symbol 


Type 


Name and Function 


WYq/COR 



O.D. 


WAYq: (When the BXU is in processor mode). Indicates which one of 
the two "ways" in a directory set had a cache hit. The line is intended 
to drive the SRAM address pins and will remain stable throughout the 
length of a cache access. 

CORRECT: (When the BXU is in memory mode). Is used by the BXU to 
tell an external ECC controller to correct the memory data as it flows 
onto the local bus. If this signal is not asserted, then the memory data 
may flow directly onto the local bus with only error checking, but no 
correction. 


WyT/mem 



O.D. 


WAY-i: (Defined only when the BXU is in PROCESSOR mode). 
Indicates if the access is for the cache or private memory half of the 
SRAM. The line is intended to drive the SRAM address lines directly 
and will remain stable throughout the length of a cache access. 
MEMORY/REGISTER REQUEST: (Defined only when the BXU is in 
MEMORY mode). This signal allows mapping some of the BXU's 
register space out to the registers in an external controller. If the signal 
is high, the associated L-Bus request is a memory request; otherwise, 
the L-Bus request is to an external register on the board. 


WD0/UNC 


I/O 
O.D. 


WORD0: (Only defined when the BXU is in PROCESSOR mode). 
Provides the low order bit of the word address for the SRAM. Together 
with WORD1, the two bits indicate which of the four words within an 
address line should be addressed. Because SRAM timing is critical, an 
external latch could be required. The signals change for each word of 
data transferred. 

UNCORRECTABLE ECC: (Only defined when the BXU is in MEMORY 
mode). Is an input used by the external ECC logic to signal to the BXU 
that it has detected an uncorrectable memory error. 


WD1/ECC 


I/O 
O.D. 


WORD1: (Defined only when the BXU is in PROCESSOR mode). 
Provides the high order bit of the word address for the SRAM. 
Together with WORD0, the two bits indicate which of the four words 
within an address line should be addressed. Because SRAM timing is 
critical, an external latch will be required. The signals change for each 
word of data transferred. 

ECC ERROR: (Defined only when the BXU is in MEMORY mode). Is 
an input used by the external ECC logic to signal to the BXU that it has 
detected a memory error. The signal will be asserted even though 
external logic may be correcting the error and providing correct data 
on the L-Bus. If the BXU is asserting its CORRECT signal, the ECC 
ERROR signal will be ignored. Only the UNC pin will be checked for an 
error indication under these conditions. 




SSBUSY 


I/O 
O.D. 


SUBSYSTEM BUSY: Connects together all BXUs in a module that are 
in the same subsystem. When the signal is pulled low (BUSY), the 
BXUs will accept a request address, but will not continue with the data 
cycles. This signal is used to ensure that the BXUs always handle 
RMW-writes, Interagent Communication messages, and retries 
correctly. An external signal is needed because BXUs can generate 
AP-Bus requests internally because of the prefetcher, or their internal 
logic can be tied up handling an IAC request from the AP-Bus. 
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Table 11. M82965 BXU Module Support Signals (Continued) 



Symbol 


Type 


Name and Function 




POPQUE 


I/O 
O.D. 


POP QUEUE: Is used by the two BXUs acting as bus backups for each 
other to communicate status on the completion of outstanding L-Bus 
requests. Usually, this signal is asserted when the oldest write in the 
queue has completed. During the partner ordering period, a different 
protocol is used to convey the status of all write requests outstanding. 


LERL1-LERL0 


I/O 
O.D. 


LOCAL ERROR REPORTING LINES: Are identical to the BERL 
signals defined for the AP-Bus, but are used on the module side to 
connect all BXUs on a single L-Bus. 



NOTES: 

I/O = Input/Output, I = Input, O = Output, O.D. = Open Drain 



MECHANICAL DATA 



Pin Assignment 

The MG82965 BXU (PGA package) pinout as 
viewed from the top side of the component (pins 
down) is shown in Figure 17 and from the bottom 
side (pins up) in Figure 18. 



Vcc and GND connections must be made to multiple 
Vcc and GND pins. Each Vcc and GND pin must 
be connected to the appropriate voltage or ground 
and externally strapped close to the package. Preferably, 
the circuit board should include power and 
ground planes for power distribution. Table 12 lists 
the function of each pin. 

Many of the signals are multiplexed and several sig- 
nals have different interpretations depending on 
whether the BXU is used in Processor or Memory 
mode. 
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Figure 17. MG82965 BXU Pinout— View from Top Side (Pins Down) 
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Figure 18. MG82965 BXU Pinout— View from Bottom Side (Pins Up) 
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Table 1 2. M82965 PGA Pinout— In Pin Order 


Pin   Signal      Pin   Signal      Pin   Signal      Pin   Signal
A1    LERL1       C6    AD22        H1    LAD30       M10   Vss
A2    Vss         C7    AD24        H2    READY       M11   Vcc
A3    POPQUE      C8    AD29        H3    BE1         M12   Vref
A4    AD16        C9    SPEC2       H12   AD13        M13   BERL0
A5    AD20        C10   Vcc         H13   AD15        M14   CHK1
A6    AD21        C11   Vss         H14   AD4         N1    LAD23
A7    AD25        C12   INITIO      J1    LAD29       N2    LAD24
A8    AD30        C13   ARB2        J2    LAD31       N3    LAD22
A9    AD26        C14   AD1         J3    CACHE       N4    LAD21
A10   AD28        D1    WD0/UNC     J12   BOUT        N5    LAD18
A11   SPEC0       D2    PFETCH      J13   COM         N6    LAD15
A12   SPEC3       D3    Vss         J14   AD8         N7    LAD12
A13   SPEC5       D12   ARB0        K1    LAD28       N8    LAD10
A14   Vcc         D13   AD0         K2    LAD26       N9    LAD6
B1    Vss         D14   AD5         K3    LAD27       N10   LAD2
B2    IAC0/ERR    E1    CW/DEN      K12   BERL1       N11   CLK2
B3    IAC1/FRF    E2    WY0/COR     K13   AD14        N12   LAD0
B4    AD17        E3    WY1/MEM     K14   AD10        N13   RESET
B5    AD18        E12   AD3         L1    ALE         N14   Vss
B6    AD19        E13   AD7         L2    ADS         P1    Vcc
B7    AD23        E14   ARB3        L3    HOLD        P2    Vss
B8    AD27        F1    BE3         L12   Vss         P3    LAD19
B9    SPEC1       F2    BE2         L13   CHK0        P4    LAD17
B10   AD31        F3    CR/DT/R     L14   MODCHK      P5    LAD16
B11   SPEC4       F12   AD6         M1    HLDA        P6    LAD14
B12   RPYDEF      F13   AD9         M2    LAD25       P7    LAD11
B13   Vss         F14   ARB1        M3    BADAC       P8    LAD9
B14   Vss         G1    W/R         M4    Vss         P9    LAD7
C1    SSBUSY      G2    BE0         M5    Vcc         P10   LAD5
C2    WD1/ECC     G3    LOCK        M6    LAD20       P11   LAD4
C3    LERL0       G12   AD11        M7    LAD13       P12   LAD1
C4    Vcc         G13   AD12        M8    LAD8        P13   Vss
C5    Vss         G14   AD2         M9    LAD3        P14   Vcc





Table 13. M82965 Pinout— In Signal Order 


Signal      PGA Pin   Signal      PGA Pin   Signal      PGA Pin   Signal      PGA Pin
AD0         D13       ALE         L1        LAD8        M8        SPEC0       A11
AD1         C14       ARB0        D12       LAD9        P8        SPEC1       B9
AD2         G14       ARB1        F14       LAD10       N8        SPEC2       C9
AD3         E12       ARB2        C13       LAD11       P7        SPEC3       A12
AD4         H14       ARB3        E14       LAD12       N7        SPEC4       B11
AD5         D14       BADAC       M3        LAD13       M7        SPEC5       A13
AD6         F12       BE0         G2        LAD14       P6        SSBUSY      C1
AD7         E13       BE1         H3        LAD15       N6        Vcc         A14
AD8         J14       BE2         F2        LAD16       P5        Vcc         C4
AD9         F13       BE3         F1        LAD17       P4        Vcc         C10
AD10        K14       BERL0       M13       LAD18       N5        Vcc         M5
AD11        G12       BERL1       K12       LAD19       P3        Vcc         M11
AD12        G13       BOUT        J12       LAD20       M6        Vcc         P1
AD13        H12       CACHE       J3        LAD21       N4        Vcc         P14
AD14        K13       CHK0        L13       LAD22       N3        Vref        M12
AD15        H13       CHK1        M14       LAD23       N1        Vss         A2
AD16        A4        CLK2        N11       LAD24       N2        Vss         B1
AD17        B4        COM         J13       LAD25       M2        Vss         B13
AD18        B5        CR/DT/R     F3        LAD26       K2        Vss         B14
AD19        B6        CW/DEN      E1        LAD27       K3        Vss         C5
AD20        A5        HLDA        M1        LAD28       K1        Vss         C11
AD21        A6        HOLD        L3        LAD29       J1        Vss         D3
AD22        C6        IAC0/ERR    B2        LAD30       H1        Vss         L12
AD23        B7        IAC1/FRF    B3        LAD31       J2        Vss         M4
AD24        C7        INITIO      C12       LERL0       C3        Vss         M10
AD25        A7        LAD0        N12       LERL1       A1        Vss         N14
AD26        A9        LAD1        P12       LOCK        G3        Vss         P2
AD27        B8        LAD2        N10       MODCHK      L14       Vss         P13
AD28        A10       LAD3        M9        PFETCH      D2        WD0/UNC     D1
AD29        C8        LAD4        P11       POPQUE      A3        WD1/ECC     C2
AD30        A8        LAD5        P10       READY       H2        W/R         G1
AD31        B10       LAD6        N9        RESET       N13       WY0/COR     E2
ADS         L2        LAD7        P9        RPYDEF      B12       WY1/MEM     E3
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Package Dimensions and Mounting 

The MG82965 BXU is packaged in either a 132-lead 
ceramic pin-grid array (PGA) or a 164-pin CQP package. 
(Contact factory for details on CQP availability.) 
Pins in the PGA package are arranged 0.100 inch 
(2.54 mm) center-to-center, in a 14 by 14 matrix, 
three rows around. See Figure 19. 

A wide variety of available sockets allows low-insertion 
or zero-insertion force mountings, and a choice 
of terminals such as soldertail, surface mount, or 
wire-wrap. Figure 20 shows several applicable sockets. 



Package Thermal Specification 

The M82965 BXU is specified for operation when its 
case temperature is within the range of -55°C to 
+ 125°C. The PGA case temperature should be 
measured at the center of the top surface opposite 
the pins as shown in Figure 21. 
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Figure 19. A 132-Lead Pin-Grid Array (PGA) Used to Package the MG82965 BXU 
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Amp Incorporated 
(Harrisburg, PA 17105 U.S.A., Phone 717-564-0100) 

• Low insertion force (LIF) soldertail 55274-1 
  (Amp tests indicate 50% reduction in insertion 
  force compared to machined sockets) 

Other socket options 

• Zero insertion force (ZIF) soldertail 55583-1 
• Zero insertion force (ZIF) Burn-in version 55573-2 

Amp LIF Socket 55274-1: Cam handle locks in low profile position 
when MG82965 is installed (handle UP for open and DOWN for 
closed positions). 

Courtesy Amp Incorporated 

Advanced Interconnections 
(5 Division Street, Warwick, RI 02818 U.S.A., Phone 401-885-0485) 

Peel-A-Way* and Kapton Socket Terminal Carriers 

• Low insertion force surface mount CS132-37TG 
• Low insertion force soldertail CS132-01TG 
• Low insertion force wire-wrap CS132-02TG (two-level), 
  CS132-03TG (three-level) 
• Low insertion force press-fit CS132-05TG 

Peel-A-Way Carrier No. 132: Kapton Carrier is KS132, Mylar 
Carrier is MS132. The drawing shows the Molded Plastic Body 
KS132, Foot Print No. 132 (.100 TYP, 14x14x3 rows), with these 
terminal styles: Solder Tail -01, Low Profile -04, Wire Wrap 
-02/-03, Solder Tail -33, Press Fit -05, Surface Mounting -37. 

Courtesy Advanced Interconnections 
(Peel-A-Way Terminal Carriers U.S. Patent No. 4442938) 

* Peel-A-Way is a trademark of Advanced Interconnections. 



Figure 20. Several Socket Options for Mounting the M82965 BXU 
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. MEASURE CASE TEMPERATURE 
AT CENTER OF TOP SURFACE 




132 -PIN PGA 




Figure 21. Measuring MG82965 Case Temperature 



ELECTRICAL SPECIFICATIONS 



Power and Grounding 

The M82965 is implemented in CHMOS III technology 
and has modest power requirements. Its high 
clock frequency and numerous output buffers 
(address/data, control, error, and arbitration signals) 
can cause power surges as multiple output buffers 
drive new signal levels simultaneously. For clean 
on-chip power distribution at high frequency, seven Vcc 
and thirteen Vss pins separately feed functional 
units of the M82965. 

Power and ground connections must be made to all 
Vcc and Vss pins of the M82965. On the circuit 
board, all Vcc pins must be strapped closely together, 
preferably on a Vcc plane. Likewise, all Vss pins 
should be strapped together, preferably on a ground 
plane. 
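
As an illustration of the rule above, the following sketch checks a board netlist for power pins that were left unconnected. The pin lists are transcribed from the pinout tables (Tables 12 and 13); the netlist format and the function name are assumptions made for this example only.

```python
# Power pins of the 132-lead PGA, transcribed from Tables 12 and 13.
VCC_PINS = ["A14", "C4", "C10", "M5", "M11", "P1", "P14"]
VSS_PINS = ["A2", "B1", "B13", "B14", "C5", "C11", "D3",
            "L12", "M4", "M10", "N14", "P2", "P13"]


def unconnected(netlist, required):
    """Return the required pins that the netlist does not tie to any net."""
    return sorted(pin for pin in required if pin not in netlist)


# Hypothetical board netlist (pin -> net name). C10 is deliberately omitted
# to show what a missed power connection looks like.
netlist = {pin: "VCC" for pin in VCC_PINS if pin != "C10"}
netlist.update({pin: "GND" for pin in VSS_PINS})

missing = unconnected(netlist, VCC_PINS + VSS_PINS)
print(len(VCC_PINS), len(VSS_PINS), missing)
```

The list lengths agree with the seven Vcc and thirteen Vss pins stated above, which gives a quick consistency check on the transcription.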



Power Decoupling Recommendations 

Liberal decoupling capacitance should be placed 
near the M82965. The BXU, when driving its two 32-bit 
address/data buses (AP-Bus and L-Bus), can 
cause transient power surges, particularly when driving 
large capacitive loads. 

Low inductance capacitors and interconnects are 
recommended for best high frequency electrical per- 
formance. Inductance can be reduced by shortening 
the board traces between the BXU and decoupling 
capacitors as much as possible. 



Connection Recommendations 

For reliable operation, always connect unused inputs 
to an appropriate signal level. In particular, if 
PFETCH or LERL0-1 are not used, they should be 
pulled up, and if the CACHE input is not used (i.e., 
BXU operating in the Memory mode) it should be 
tied low through a 10 kΩ resistor. No inputs should 
ever be left floating. 

All open-drain outputs require a pullup device. While 
in most cases a simple pullup resistor will be adequate, 
a network of pullup and pulldown resistors 
biased to a valid VIH (e.g., 3.5V) will limit noise and 
AC power consumption, especially on the AP-Bus. 
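
The bias point of such a pullup/pulldown network follows from its Thevenin equivalent. The sketch below is only a worked calculation: the resistor values are hypothetical and are chosen solely because they produce the 3.5V example level with a 5V supply. They are not recommended component values.

```python
def thevenin(vcc, r_up, r_down):
    """Thevenin equivalent of a pullup (to Vcc) / pulldown (to GND) pair.

    Returns (open-circuit voltage in volts, source resistance in ohms).
    """
    v_th = vcc * r_down / (r_up + r_down)
    r_th = r_up * r_down / (r_up + r_down)
    return v_th, r_th


# Hypothetical values: 300 ohm pullup, 700 ohm pulldown, 5.0V supply.
v_th, r_th = thevenin(5.0, 300.0, 700.0)
print(v_th, r_th)
```

With these assumed values the undriven line rests at 3.5V behind an equivalent resistance of 210 ohms. The sink current the open-drain driver sees when it pulls the line low must still be checked against the IOL test conditions in Table 14.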
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ABSOLUTE MAXIMUM RATINGS" 

Case Temperature under Bias ........ -55°C to +125°C 

Storage Temperature ................ -65°C to +150°C 

Voltage on Any Pin ................. -0.5V to Vcc + 0.5V 

Power Dissipation .................. 2.5W 



NOTICE: This data sheet contains information on 
products in the sampling and initial production phases 
of development. The specifications are subject to 
change without notice. Verify with your local Intel 
Sales office that you have the latest data sheet before 
finalizing a design. 



* WARNING: Stressing the device beyond the "Absolute 
Maximum Ratings" may cause permanent damage. 
These are stress ratings only. Operation beyond the 
"Operating Conditions" is not recommended and extended 
exposure beyond the "Operating Conditions" 
may affect device reliability. 



Operating Conditions 

Symbol   Description                      Min     Max     Units
TC       Case Temperature (Instant On)    -55     +125    °C
Vcc      Digital Supply Voltage           4.75    5.25    V



Table 14. D.C. Characteristics (Over Specified Operating Conditions) 



Symbol   Parameter                                   Min        Max         Units  Comments
VIL      Input Low Voltage                           -0.3       +0.8        V
VILA     Input Low Voltage: AP-Bus                   -0.5       +1.0        V
VIH      Input High Voltage                          2.0        Vcc + 0.3   V
VIHA     Input High Voltage: AP-Bus                  2.0        Vcc         V
VREF/C   VREF Trip Point Cold Start                  Vcc - 0.7              V
VREF/W   VREF Trip Point Warm Start                  1.7        1.8         V
VCL      CLK2 Input Low Voltage                      -0.3       +1.0        V
VCH      CLK2 Input High Voltage                     0.55 Vcc   Vcc         V
VOL      Output Low Voltage:
           IOL = 4 mA: LAD Lines                                0.45        V
           IOL = 5 mA: Controls(2)                              0.45        V
           IOL = 25 mA: L-Bus Open-Drain Outputs                0.45        V
           IOL = 80 mA: AP-Bus Open-Drain Outputs               0.70        V
VOH      Output High Voltage:
           IOH = 1 mA: LAD Lines                     2.4                    V
           IOH = 0.9 mA: Controls(2)                 2.4                    V
           IOH = 5.0 mA: ALE                         2.4                    V
ICC      Power Supply Current                                   450         mA
ILI      Input Leakage Current                                  ±15         µA     0V ≤ VIN ≤ Vcc
ILO      Output Leakage Current                                 ±15         µA     0.45V ≤ VOUT ≤ Vcc
CIN      Input Capacitance                                      10          pF     Note 1
CO       I/O or Output Capacitance                              12          pF     Note 1
CCLK     Clock Capacitance                                      12          pF     Note 1


NOTES: 

1. Test frequency = 1 MHz, Tc = 25°C, unmeasured pins at GND. 

2. "Controls" include all L-Bus I/O pins not otherwise specified. 
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A.C. SPECIFICATIONS 

This section describes the A.C. specifications for the 
M82965 pins. All input and output timings are speci- 
fied relative to the 1.5V level of the rising edge of 
CLK2, and refer to the time at which the signal 
reaches (for output delay and input setup) or leaves 
(for hold time) the TTL levels of LOW (0.8V) or HIGH 
(2.0V). 

All A.C. testing should be done with input voltages of 
0.45V and 2.4V. 



Maximum output hold times are the same as mini- 
mum output delays. Tri-state signals have no resis- 
tive load or termination. 

The Output Delay specified for open-drain signals 
includes both the low to high and high to low tran- 
sitions. The float delay is the amount of time that the 
pulldown transistor may remain active. This specifi- 
cation is provided to help system designers calcu- 
late propagation delay for terminations other than 
the one used for testing. 
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Table 15. M82965 A.C. Timing Specifications (Over Specified Operating Conditions) 



Symbol   Parameter                           Min     Max     Units  Comments
T1       Clock Period                        31.25   125     ns     VIN = 1.5V
T2       Clock Low Time                      11              ns     VIN = 10% Point = 1.2V
T3       Clock High Time                     11              ns     VIN = 90% Point = 0.1V + 0.5 Vcc
T4       Clock Fall Time                             10      ns     VIN = 90% Point to 10% Point
T5       Clock Rise Time                             10      ns     VIN = 10% Point to 90% Point
T6       Output Valid Delay:
           LAD                               4       35      ns     CL = 100 pF
           WY                                4       35      ns     CL = 125 pF
           CW, WD, SSBUSY                    4       30      ns     CL = 75 pF
           CR                                4       45      ns     CL = 75 pF
           Controls(1)                       2       35      ns     CL = 75 pF
T7       ALE Width                           15              ns     CL = 75 pF
T8       ALE Invalid Delay                           20      ns     CL = 75 pF
T9       Output Float Delay:
           LAD                               5       20      ns     CL = 100 pF
           WY                                5       22      ns     CL = 125 pF
           Controls(1)                       5       22      ns     CL = 75 pF
T10      Input Setup Time:
           LOCK, HOLD, HOLDAR, READY         8               ns     10% Point
           ECC, UNC                          15              ns     10% Point
           Controls(1)                       3               ns     10% Point
T11      Input Data Hold                     10              ns     90% Point
T12      Setup to ALE Inactive               10              ns     CL = 100 pF (LAD), CL = 75 pF (Controls)
T13      Hold after ALE Inactive             8               ns     CL = 100 pF (LAD), CL = 75 pF (Controls)
T14      RESET Hold                          5               ns
T15      RESET Setup                         8               ns
T16      RESET Width                         1250            ns     40 CLK2 Periods Minimum
T17      Clock to Data Valid (AP-Bus)                17      ns     CL = 50 pF, IOL = 50 mA
T18      Clock to High Impedance (AP-Bus)            14      ns
T19      Output Hold (AP-Bus)                5               ns     CL = 50 pF, IOL = 50 mA
T20      Input Setup (AP-Bus)                7               ns
T21      Input Hold (AP-Bus)                 10              ns


NOTE: 

1. "Controls" include all L-Bus I/O pins not otherwise specified. 
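
The RESET width entry in Table 15 (1250 ns, "40 CLK2 Periods Minimum") equals 40 times the minimum clock period of 31.25 ns. The sketch below reproduces that arithmetic. It assumes the requirement scales with the actual CLK2 period while never dropping below the tabulated 1250 ns; that is one reading of the table comment, not a statement taken from the data sheet.

```python
RESET_MIN_CLK2_PERIODS = 40   # Table 15 comment for RESET Width
RESET_MIN_WIDTH_NS = 1250.0   # Table 15 minimum
T1_MIN_NS, T1_MAX_NS = 31.25, 125.0  # Table 15 clock period limits


def min_reset_width_ns(clk2_period_ns):
    """Minimum RESET pulse width for a given CLK2 period (assumed scaling)."""
    if not T1_MIN_NS <= clk2_period_ns <= T1_MAX_NS:
        raise ValueError("CLK2 period outside the T1 limits of Table 15")
    return max(RESET_MIN_WIDTH_NS, RESET_MIN_CLK2_PERIODS * clk2_period_ns)


print(min_reset_width_ns(31.25))  # fastest clock allowed by T1
print(min_reset_width_ns(125.0))  # slowest clock allowed by T1
```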
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[CLK2 waveform not reproduced. Levels marked: high level (min) 0.55 Vcc, low level (max) 1.0V.] 



Figure 22. CLK2 Timing 



[Waveform drawing not reproduced. It shows the L-Bus output and 
input signal groups (inputs include BE3-BE0, LOCK, and ADS) 
relative to CLK2.] 

*NOTE: 

LERL signals must be asserted at both edges A2 and A3 in order for them to be recognized by the BXU. 



Figure 23. Drive Levels and Measurement Points for A.C. Specifications. 
L-Bus Timings for the BXU as a Bus Slave 
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[Waveform drawing not reproduced. It shows, relative to CLK2 and 
measured at the 2.0V and 0.8V levels: 

OUTPUTS: LAD31-LAD0; ADS, DEN; CACHE, HOLDR, FRF, COR, MEM, 
W/R, LOCK, HLDA; ALE (width T7); DT/R. Output valid delay is T6. 

INPUTS: LAD31-LAD0, READY, ERR, ECC, UNC, LOCK, HOLD, HOLDAR, 
LERL1-LERL0. Input setup is T10 and input hold is T11.] 



*NOTE: 

LERL signals must be asserted at both edges A2 and A3 in order for them to be recognized by the BXU. 



Figure 24. Drive Levels and Measurement Points for A.C. Specifications. 
L-Bus Timings for the BXU as a Bus Master 
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Figure 25. Relative Timing for L-Bus Signals 
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*NOTE: 

BERL signals must be asserted at both edges A2 and A3 in order for them to be recognized by the BXU. 



Figure 26. Relative Timing for AP-Bus Signals 
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Figure 27. RESET Setup and HOLD Timing 
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L-BUS DESIGN CONSIDERATIONS 

Input hold times can be disregarded by the designer 
whenever the input is removed because of a subsequent 
output from the BXU (e.g., DEN becomes 
deasserted). In other words, whenever the BXU generates 
an output that indicates a transition into a 
subsequent state, the BXU will have sampled any 
inputs for the previous state. 

As an example, in the recovery (Tr) cycle following a 
read, the minimum time (t6 Min) that DEN becomes 
asserted is specified to be less than the minimum 
hold time on the data (t11 Min). When DEN is asserted, 
however, the data is guaranteed to have been 
sampled. 

Similarly, whenever the BXU generates an output 
that indicates a transition to a subsequent state, any 
outputs that are specified to be tri-stated in this new 
state will be tri-stated. 

For example, in the data (Td) cycle following an address 
(Ta) cycle for a read, the minimum output delay 
(t6 Min) of DEN is specified to be less than the 
maximum float time of LAD (t9 Max). When DEN is 
asserted, however, the LAD outputs are guaranteed 
to have been tri-stated. 



AP-BUS SIGNAL TIMING 
CONSIDERATIONS 

The AP-Bus uses three-quarter cycle signaling for 
data transmission. Data is driven on edge D and 
sampled on edge C. This approach allows three- 
quarters of the bus cycle to be used for data trans- 
mission. 

The remaining (one-quarter) time allows for clock 
skew and signal hold time. All AP-Bus signals except 
for the ARB, CHK, and BERL signals use this timing. 
The relationship of the AP-Bus signals is shown in 
Figure 28. 



The CHK signals (interlaced parity) are delayed by 
one-half cycle or one phase to allow for generation 
of parity from the internal data that is being transmit- 
ted. The CHK lines are sampled one phase after the 
data has been sampled and compared against the 
parity generated for the received data. 

Most input signals on the AP-Bus are sampled on 
the rising edge of CLK2 at edge C. The exceptions 
are the error signals CHK, BERL and ARB, which 
are sampled on the rising edge of CLK2 at edge A. 
Regardless of the edge, the setup and hold times 
are the same. 

All outputs on the AP-Bus are driven relative to the 
falling edge of CLK2 at the middle of phase 2, except 
CHK, BERL and ARB, which transition on the 
falling edge of CLK2 at the middle of phase 1. 

When designing a system based on the AP-Bus, the 
system topology will be limited by the available prop- 
agation time for signals in the system. The propaga- 
tion time must allow for settling of ringing, ground 
shift, and crosstalk, all of which are dependent on 
board and system materials and design. 

The following equation gives the propagation time 
available, given a specific clock implementation and 
frequency: 

Tprop = 2 T1 - (T3 + T4 + T5 + (T18 or T19) + T10a + Tskew) 

where Tskew is the worst case clock skew between 
BXUs (clock skew is the time delay between any two 
clocks in the system due to physical distribution limits). 

In AP-Bus systems, this skew is defined as follows: 

Tskew ≤ T3 + T20 - T11 
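
The propagation-time equation above is plain arithmetic on the Table 15 parameters, so a short sketch may help when budgeting a backplane. The function below only evaluates the printed expression. In the example call, the T10a and Tskew values are placeholders (T10a is not listed in Table 15), and T18 is used for the "(T18 or T19)" term; treat the numbers as illustrative, not as a system specification.

```python
def ap_bus_prop_time(t1, t3, t4, t5, t_out, t10a, t_skew):
    """Evaluate Tprop = 2*T1 - (T3 + T4 + T5 + t_out + T10a + Tskew).

    t_out stands for the "(T18 or T19)" term; all times in nanoseconds.
    """
    return 2 * t1 - (t3 + t4 + t5 + t_out + t10a + t_skew)


# T1, T3, T4, T5 and T18 are taken from Table 15 (fastest clock).
# T10a = 7.0 and Tskew = 2.0 are placeholder assumptions for this example.
t_prop = ap_bus_prop_time(t1=31.25, t3=11.0, t4=10.0, t5=10.0,
                          t_out=14.0, t10a=7.0, t_skew=2.0)
print(t_prop)
```

A negative result would mean that the chosen clock period leaves no time for signals to propagate and settle, so the clock must be slowed or the skew reduced.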



L-Bus Waveforms 

Figures 30 through 36 illustrate the relationship of L- 
Bus signals during a variety of bus transactions. For 
a detailed discussion of the operation of the L-Bus, 
consult the 80960MC Hardware Designer's Refer- 
ence Manual. 
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Figure 28. AP-Bus Signal Timing 
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Figure 29. System and Processor Clock Relationship 
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Figure 30. L-Bus Read Transaction 
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Figure 31. L-Bus Write Transaction 
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Figure 32. L-Bus Burst Read Transaction 
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Figure 33. L-Bus Burst Write Transaction with One Wait State 
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Figure 34. Hold Timing 
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Figure 35. Interrupt Acknowledge Transaction 
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Figure 36. Bus Exchange Transaction (PBM = Primary Bus Master, SBM = Secondary Bus Master) 
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Figure 1. Pinout Diagram 
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GENERAL DESCRIPTION 

The Intel 85C960 is a single-chip burst/ready/decode 
µPLD (Microcomputer Programmable Logic 
Device) designed to interface 80960 KA/KB embedded 
controllers to system memory and I/O. The 
85C960 provides programmable chip selects, a programmable 
read/write access wait state/ready generator, 
and burst address (A2, A3) cycling. Burst 
transaction cycling of A2, A3, and WCLK# (Write 
Clock) is also supported for intelligent peripherals on 
the bus. 

For its programmable functions, the 85C960 uses 
advanced EPROM cells as logic array and wait-state 
table memory elements. Coupled with Intel's proprietary 
CHMOS IIIE technology, the result is a programmable 
device able to support Intel's 32-bit 
80960 KA/KB embedded controllers at speeds up to 
25 MHz. 



ARCHITECTURE DESCRIPTION 

The 85C960 µPLD integrates burst control, ready 
generation, and chip select decoding into a single 
device. Figure 2 shows the architecture of the 
85C960. Table 1 lists and describes each signal on 
the device. The 85C960 replaces 6-10 separate 
PLD/discrete logic devices in small- and medium- 
sized 80960 systems. For medium- to large-sized 
systems, the 85C960 can be supplemented with an 
additional decoder, such as the 85C508, and a sec- 
ond 85C960. Figure 3 shows a single 85C960 in a 
typical application. 
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Figure 2. 85C960 Block Diagram 
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Table 1. 85C960 Pin Descriptions 



Symbol 


Type 


Name and Function 


RESET 




RESET. When RESET is high for a minimum of four CLK2 cycles, internal 
circuits are reset to a known state. 


I7-I0 




INPUT 7-INPUT 0. These are the address range inputs to the 
programmable decode logic array. 


CLK2 




SYSTEM CLOCK. This input, which connects to the 80960 CLK2 signal, 
provides the timing reference for all 85C960 operations. 


AD3-AD0 




ADDRESS IN 3-ADDRESS IN 0. These inputs are driven by LAD0-LAD3 
from the Local Bus (L-Bus) to provide addressing and burst access decode 
information. 


W/R# 




WRITE/READ. Write/Read from controller. When low, indicates that the 
current access is a read. When high, indicates that the current access is a 
write. 


DEN# 




DATA ENABLE. This input from the controller indicates that data is present 
on the L-Bus. 


ADS# 




ADDRESS/DATA STROBE. This input from the 80960 indicates whether 
address or data information is currently on the L-Bus. When low, address 
information is changing. The 85C960 chip select timing is based in part on 
ADS# low during Ta states. 


BLAST # 





BURST LAST. This signal, when low, indicates that the current read/write 
access is the last access in a burst transaction. BLAST# is not cycled if 
RDY# is generated off-chip. 


WCLK# 





WRITE CLOCK. This output provides a write enable strobe to memories that 
do not support burst mode access. 


A3,A2 





ADDRESS OUT 3, 2. These outputs cycle during burst transactions. 
Typically connected to lowest memory address signals. 


CS3#-CS0# 





CHIP SELECT 3-CHIP SELECT 0. Single p-term select outputs that are 
driven active (low) for the programmed address condition on I7-I0. 


RDY# 


I/O 


READY. RDY# is an active low, bidirectional, open-drain signal that should 
be connected to the controller's Ready input. As an output, RDY# goes high 
to cause the controller to extend the current access. RDY# goes low to 
indicate that the data on the L-Bus may be sampled (read) or removed 
(write). RDY# is controlled by the 85C960 Ready Generation and Wait-State 
Logic. The open-drain output allows RDY# to be OR-tied to other circuitry 
that may drive the controller's Ready input. As a bidirectional input, RDY# 
allows the 85C960 to provide Ready timing and burst cycling for intelligent 
peripherals that do not generate these signals themselves. 
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80960 L-Bus (Local Bus) cycles are monitored by 
the Bus State Tracker to synchronize the functional 
blocks in the 85C960 to the L-Bus. CLK2 provides 
the timing reference for all 85C960 operations. 

Four external chip selects (CS0#-CS3#) are gen- 
erated by the programmable Chip Select Decoder. 
These four signals provide decoded selects to mem- 
ory and I/O devices and are routed to the program- 
mable Wait-State Table so that the 85C960 can 
generate RDY# at the appropriate time. Four addi- 
tional selects are decoded (internal only) and routed 
to the Wait-State Table so that the 85C960 can gen- 
erate RDY# for up to four additional address 
ranges. 

The Ready Generation block generates RDY# to 
the controller under control of the Wait-State Table. 
Depending on the contents programmed into this ta- 
ble and the current type of access, from 0-15 wait 
states can be introduced into each bus cycle. An 
independent wait state value can be chosen for 
each select and each access type. Four access 
types are possible: read first, read subsequent, write 
first, and write subsequent. 

The Burst Control and Address Counter blocks 
control burst transaction timing to memory and I/O. 
Note that the RDY# pin is sampled by the Burst 
Control block to allow the 85C960 to generate burst 
transaction timing for other bus peripherals. WCLK# 
provides a write enable strobe for memory and I/O 
that do not support burst mode. BLAST# informs 
burst-mode devices that the current access is the 
last one in a burst transaction. A2 and A3 are cycled 
to select the address location for each access. 



FUNCTIONAL DESCRIPTION 

The following paragraphs provide a detailed description 
of each functional block in the 85C960 µPLD. 



Chip Select Decoder 

The Chip Select Decoder, shown in Figure 4, is a 
high speed, single p-term (product-term) latched de- 
coder circuit with eight inputs (I0-I7) and eight 
latched outputs. Each output goes low when its as- 
sociated product term is true. Four of these outputs 
(CS0#-CS3#) are available externally to be used 
as device selects. The remaining four outputs 
(CS4#-CS7#) are available internally so that the 
85C960 can provide ready and burst timing for four 
more device selects. (The actual selects for these 
four additional devices/resources must be generat- 
ed by external logic.) 

The input to each latch is a single NAND p-term that 
can be connected to the dedicated inputs. The true 
and complements of all inputs (I7-I0) are available 
to all eight NAND p-terms. 

Each intersecting point in the logic array is connect- 
ed or not connected based on the value pro- 
grammed in the EPROM array. Initially (EPROM 
erased state), no connections exist between any 
p-term and any input. Connections can be made by 
programming the appropriate EPROM cells. Since 
p-terms are implemented as NANDs, a true condi- 
tion on a p-term drives the output low. Current con- 
sumption is higher when both true and complement 
p-terms for the same input are programmed. 

Selects are latched on the falling edge of an internal 
Latch Enable (LE), which is generated from ADS#, 
DEN#, and CLK2. The proper combination of these 
signals occurs during an 80960 address state (Ta). 
Figure 5 shows the relationship of the internal LE 
and external chip selects to the three signals at the 
end of a Ta state. All selects are cleared to an inactive 
high state at the start of a recovery state (Tr). 
All eight selects (four external and four internal) are 
routed to the Wait-State Table. 



Wait State Table 

Chip selects, WR (Write/Read), and SW (Subse-
quent Word) feed the Wait-State Table. Each chip
select points to a set of four wait state values while 
WR and SW determine which of the four values to 
route to the Ready Generation block (see Figure 6). 
The four values are grouped into read and write 
groups with each group having a value for the first 
access and subsequent access (second through 
fourth). The four-bit wait-state value is sent to the 
Ready Generation block (via WS0#-WS3#) to be 
used as an initial count value. If two selects are ac- 
tive, the resulting count value is the logical bit AND 
of the two individual values. If more than two selects 
are active and the individual count values are not the 
same, the resulting count value is indeterminate. If 
no select is active, no count value is loaded (and the 
Ready Generation circuit is disabled). 
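
As a rough illustration of the lookup rules in this paragraph, the sketch below models the Wait-State Table as a Python dictionary. The table layout, the function name `wait_state_count`, and the example values are assumptions for illustration; they are not taken from the device programming format.

```python
def wait_state_count(table, active_selects, wr, sw):
    """Return the 4-bit count sent to Ready Generation, or None if no select is active.

    table          -- {select_index: {(wr, sw): 4-bit value}} (hypothetical layout)
    active_selects -- list of chip-select indices that are currently active
    wr             -- 0 for read, 1 for write
    sw             -- 0 for first word, 1 for subsequent word
    """
    if not active_selects:
        return None  # no count is loaded; Ready Generation is disabled
    values = [table[cs][(wr, sw)] for cs in active_selects]
    if len(values) > 2 and len(set(values)) > 1:
        # More than two selects with differing values: result is indeterminate.
        raise ValueError("indeterminate count value")
    result = 0b1111
    for v in values:
        result &= v  # two active selects combine as a bitwise AND
    return result


# Illustrative table contents only.
example = {
    0: {(0, 0): 0b0000, (1, 0): 0b0000, (0, 1): 0b0011, (1, 1): 0b0010},
    1: {(0, 0): 0b0001, (1, 0): 0b0001, (0, 1): 0b0101, (1, 1): 0b0100},
}
print(wait_state_count(example, [0], wr=0, sw=1))
```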



Ready Generation 

RDY# is high at the start of each burst transaction. 
The RDY Generator begins to count down from the 
wait state value, decrementing the counter at the 
start of each wait state. When the internal counter 
reaches 0000, RDY# is pulled low (CLK2c during 
the data state). On the next CLK2c edge (for a wait 
state), RDY# is released, allowing an external resis- 
tor to pull RDY# high. Figure 7 shows the timing for 
a four-word burst write transaction with 1 wait state
for the first access and 0 wait states for the remain-
ing three accesses (Burst Write 1-0-0-0).
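
The "N-M-M-M" burst notation can be expanded into a sequence of bus states with a short sketch. The state names Ta, Tw, Td, and Tr follow the data sheet; the function `burst_states` itself is a hypothetical helper, and it models only the state count, not the CLK2 edge timing.

```python
def burst_states(first_ws, subsequent_ws, words):
    """Expand a burst pattern into bus states (hypothetical helper).

    first_ws      -- wait states before the first data transfer
    subsequent_ws -- wait states before each later data transfer
    words         -- number of words in the burst (1 to 4)
    """
    states = ["Ta"]                      # address state
    for n in range(words):
        ws = first_ws if n == 0 else subsequent_ws
        states += ["Tw"] * ws + ["Td"]   # wait states, then one data state
    states.append("Tr")                  # recovery state
    return states


# Burst Write 1-0-0-0, four words.
print(burst_states(1, 0, 4))
```
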
RDY# is an open-drain I/O pin, which must be con-
nected to pullup and pulldown resistors as shown in
Figure 8. During a wait-state access, RDY# is pulled 
high to cause the controller to extend the current 
access so that the memory or peripheral chip has 
time to present data to the bus (read), or sample 
data on the bus (write). RDY# is released on the
CLK2a edge of a Tr state. If a Read or Write access
occurs without a chip select having been decoded 
on-chip, the RDY# output buffer is disabled and 
RDY# is sampled as an input. This allows the 
85C960 to cycle A2, A3, and WCLK# to provide 
burst transaction timing for other bus controllers. 
RDY# may be OR-tied with other bus controllers so 
they can access the processor Ready signal. 



[Figure 4 diagram: inputs I0-I7 feed eight NAND p-terms; each p-term drives a latch (gate input G) whose output is one of CS0#-CS7#. The internal latch enable (LE) is derived from ADS#, DEN#, and CLK2 through a divide-by-2 block. Drawing 290192-5.]

Figure 4. 85C960 Chip Select Decoder Block 
[Figure 5 timing diagram: CLK2, ADS#, DEN#, LE (internal), and chip select active (based on I0-I7), with the latch-open interval marked.]

Latch opens when CLK2 and DEN# go high and ADS# goes low. 
Latch closes when DEN# goes low or ADS# or CLK2 go high. 



Figure 5. Internal LE and External Chip Select Timing 



Burst Transactions 

AD3, AD2 are latched to indicate the starting ad- 
dress of a burst transaction. The 85C960 places 
these two signals out on A3 and A2, respectively, 
then cycles the two addresses upward until the last 
access of the burst. The 85C960 assumes that the 
processor handles splitting of the burst transaction 
when a 16-byte boundary is crossed. 

AD0 and AD1 specify the size of the burst transfer in
double-words as shown in Table 2.

Table 2. AD0-AD1 vs Burst Size

AD1   AD0   No. of Words Transferred
 0     0    1
 0     1    2
 1     0    3
 1     1    4
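
A minimal sketch of how the latched AD3-AD0 values determine the A3/A2 sequence of a burst is given below. The function name `burst_addresses` is hypothetical, and the range check reflects the stated assumption that the processor never lets a burst cross a 16-byte boundary.

```python
def burst_addresses(ad3, ad2, ad1, ad0):
    """Return the (A3, A2) pairs driven during a burst (illustrative model).

    AD3, AD2 give the starting word address; AD1, AD0 encode the burst
    size as (number of words - 1), per Table 2.
    """
    start = (ad3 << 1) | ad2
    words = ((ad1 << 1) | ad0) + 1
    if start + words > 4:
        # Assumption from the text: the processor splits such bursts itself.
        raise ValueError("burst would cross a 16-byte boundary")
    return [((a >> 1) & 1, a & 1) for a in range(start, start + words)]


# Three-word burst starting at word 1: A3/A2 step through 01, 10, 11.
print(burst_addresses(0, 1, 1, 0))
```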



WCLK#, BLAST# Generation

WCLK# is the write enable signal for writing to non- 
burst mode memories. When low, address outputs 
A2 and A3 are valid. Its trailing edge (low-to-high 
transition) can be used to latch data into non-burst 
mode memories. WCLK# is only provided during 
writes; during reads, WCLK# remains high. 

BLAST# indicates that the current access is the last
access in a burst transaction. BLAST# is used by 
burst-mode memories to reset internal address 
counters. BLAST# is not cycled when RDY# is gen- 
erated off-chip. 



POWER-ON CHARACTERISTICS 

85C960 inputs and outputs begin responding 1 µs
(max.) after Vcc power-up (Vcc = 4.75V) or after a 
power-loss/power-up sequence. RESET must be 
synchronous to CLK2 and must be held high for a 
minimum of 4 clock cycles after Vcc reaches 4.75 V. 
After 4 clock cycles, A2 and A3 are high, CS0#- 
CS3# (and CS4#-CS7#), BLAST#, WCLK# are 
high, and the open drain RDY# signal is inactive. 
Wait-state values for one chip select:

                            WR = 0 (Read)    WR = 1 (Write)
                            msb  lsb         msb  lsb
SW = 0 (First Word)           0000             0000
SW = 1 (Subsequent Word)      0011             0010

msb = most significant bit
lsb = least significant bit



Figure 6. Example Wait-State Entries for CSOf # 



ERASURE CHARACTERISTICS 



Erasure time for the 85C960 is 20 minutes at
12,000 µW/cm² with a 2537Å UV lamp.



Erasure characteristics of the device are such that 
erasure begins to occur upon exposure to light with 
wavelengths shorter than approximately 4000Å. It
should be noted that sunlight and certain types of
fluorescent lamps have wavelengths in the 3000Å-
4000Å range. Data shows that constant exposure to
room level fluorescent lighting could erase the typi- 
cal 85C960 in approximately two years, while it 
would take approximately two weeks to erase the 
device when exposed to direct sunlight. If the device 
is to be exposed to these lighting conditions for ex- 
tended periods of time, conductive opaque labels 
should be placed over the device window to prevent 
unintentional erasure. 

The recommended erasure procedure for the
85C960 is exposure to shortwave ultraviolet light
with a wavelength of 2537Å. The integrated dose
(i.e., UV intensity x exposure time) for erasure
should be a minimum of fifteen (15) Wsec/cm². The
erasure time with this dosage is approximately 20
minutes using an ultraviolet lamp with a 12,000 µW/
cm² power rating. The device should be placed with-
in 1 inch of the lamp tubes during exposure. The
maximum integrated dose the 85C960 can be ex-
posed to without damage is 7258 Wsec/cm² (1
week at 12,000 µW/cm²). Exposure to high intensity
UV light for longer periods may cause permanent
damage to the device.
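
The dose arithmetic above (integrated dose = intensity x exposure time) can be checked with a short calculation. The helper name `exposure_seconds` is an assumption for this sketch; the numbers are those quoted in the paragraph.

```python
def exposure_seconds(dose_wsec_per_cm2, lamp_uw_per_cm2):
    """Time needed to accumulate a given UV dose at a given lamp intensity."""
    lamp_w_per_cm2 = lamp_uw_per_cm2 * 1e-6  # convert µW/cm² to W/cm²
    return dose_wsec_per_cm2 / lamp_w_per_cm2


# Minimum erase dose of 15 Wsec/cm² with a 12,000 µW/cm² lamp.
erase_min = exposure_seconds(15, 12_000) / 60
# Maximum safe dose of 7258 Wsec/cm² with the same lamp.
limit_days = exposure_seconds(7258, 12_000) / 86_400

print(f"erase time ~ {erase_min:.1f} minutes")
print(f"damage threshold ~ {limit_days:.1f} days")
```

The first figure comes out slightly above the "approximately 20 minutes" quoted in the text, and the second agrees with the one-week limit.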



LATCH-UP IMMUNITY 

All of the input, output, and clock pins of the device 
have been designed to resist latch-up, which is inher-
ent in inferior CMOS processes. The 85C960 is de-
signed with Intel's proprietary 1-micron CHMOS
EPROM process. Thus, none of the pins will ex-
perience latch-up with currents up to ±100 mA and
voltages ranging from -0.5V to (Vcc + 0.5V). The
programming pin is designed to resist latch-up to the 
13.5V max. device limit. 



DESIGN RECOMMENDATIONS 

For proper operation, it is recommended that all in-
put and output pins be constrained to the voltage
range GND ≤ (VIN or VOUT) ≤ VCC. All unused in-
puts should be tied high or low to minimize power
consumption (do not leave them floating). Unused
outputs may be left floating. A high-speed ceramic
decoupling capacitor of at least 0.2 µF must be con-
nected directly between the Vcc and GND pins.

As with all CMOS devices, ESD handling procedures 
should be used with the 85C960 to prevent damage 
to the device during programming, assembly, and 
test. 



FUNCTIONAL TESTING 

Since the programmable sections of the 85C960 are 
controlled by EPROM elements, the device is com- 
pletely testable during the manufacturing process. 
Each programmable EPROM bit controlling the in- 
ternal logic is tested using application independent 
test patterns. EPROM cells in the device are 100% 
tested for programming and erasure. After testing, 
the devices are erased before shipments to the cus- 
tomers. No post-programming tests of the EPROM 
array are required. 

The testability and reliability of EPROM-based pro- 
grammable logic devices is an important feature 
over similar devices based on fuse technology. 
Fuse-based programmable logic devices require a 
user to perform post-programming tests to ensure
device functionality. During the manufacturing pro- 
cess, tests on fuse-based parts can only be per- 
formed in very restricted ways in order to avoid pre- 
programming the array. 
Figure 7. Burst Write Transaction (1-0-0-0)

[Figure 8 annotations: IOL = 28.8 mA, VOH = 3.0V]

Figure 8. RDY# Pullup/Pulldown Resistors 



IN-CIRCUIT RECONFIGURATION 

The 85C960 allows in-circuit configuration changes 
after the device has powered up. At power-up, the 
device is configured according to the information 
programmed into the EPROM cells. After power-up, 
new information can be shifted in on select pins to 
alter device configuration. The new configuration is 
retained until the device is powered down or until the 
information is overwritten by another configuration 
change. 



Note that in-circuit configuration changes allow "on-
the-fly" changes to be made, but do not alter
EPROM cell data. At the next power-up, the device
will be configured according to the original data pro-
grammed into the EPROM cells. In-circuit reconfigu-
ration requires additional circuitry external to the
85C960. For details on in-circuit configuration
changes, refer to AP-337, In-Circuit Reconfiguration
of 85C960 and 85C508 µPLDs, order number
292072.



DESIGN SOFTWARE 

Software support is provided by version 2.1 (or later) 
of iPLS II (Intel Programmable Logic Software II). 
Programming is supported on the iUP-PC PC-based 
programmer or iUP-200A/201A Universal Program- 
mer via the GUPI base module and the GUPI 
85EPLD28 programming adaptor. 

For detailed information on iPLS II, refer to the 
iPLDS II Data Sheet, order number: 290134. The 
tools section of the Programmable Logic handbook 
contains a complete listing of all design tools for In- 
tel EPLDs. 



ORDERING INFORMATION 



80960KA/KB
Clock Frequency   µPLD Order Code   Package   Operating Range
20 MHz            *D85C960-20       CERDIP    Commercial
                  N85C960-20        PLCC
25 MHz            *D85C960-25       CERDIP    Commercial
                  N85C960-25        PLCC

*Only windowed CERDIP allows UV-erase.
ABSOLUTE MAXIMUM RATINGS*

Supply Voltage (VCC)(1) .................. -2.0V to +7.0V
Programming Supply Voltage (VPP)(1) ...... -2.0V to +13.5V
D.C. Input Voltage (VI)(1, 2) ............ -0.5V to VCC + 0.5V
Storage Temperature (Tstg) ............... -65°C to +150°C
Ambient Temperature (TA)(3) .............. -10°C to +85°C

NOTES: 

1. Voltages with respect to GND. 

2. Minimum D.C. input is -0.5V. During transitions, the in- 
puts may undershoot to -2.0V or overshoot to +7.0V for 
periods of less than 20 ns under no load conditions. 

3. Under bias. Extended Temperature versions are also 
available. 



NOTICE: This is a production data sheet. The specifi- 
cations are subject to change without notice. 



* WARNING: Stressing the device beyond the "Absolute 
Maximum Ratings" may cause permanent damage. 
These are stress ratings only. Operation beyond the 
"Operating Conditions" is not recommended and ex- 
tended exposure beyond the "Operating Conditions" 
may affect device reliability. 



RECOMMENDED OPERATING CONDITIONS 



Symbol   Parameter               Min    Max    Units
VCC      Supply Voltage          4.75   5.25   V
VIN      Input Voltage           0      VCC    V
VO       Output Voltage          0      VCC    V
TA       Operating Temperature   0      +70    °C
D.C. CHARACTERISTICS (TA = 0°C to +70°C, VCC = 5.0V ±5%)

Symbol    Parameter                                   Min    Typ   Max         Unit   Test Conditions
VIH1(4)   High Level Input Voltage (all inputs        2.0          VCC + 0.3   V
          except ADS#, AD0-AD3, DEN#, and W/R#)
VIH2(4)   High Level Input Voltage for ADS#,          2.2                      V
          AD0-AD3, DEN#, and W/R#
VIL(4)    Low Level Input Voltage                     -0.3         0.8         V
VOH       High Level Output Voltage                   2.4                      V      IOH = -4.0 mA D.C., VCC = Min.
VOL1      Low Level Output Voltage                                 0.4         V      IOL = 4.0 mA D.C., VCC = Min., CL = 30 pF
VOL2      Low Level Output Voltage for A2, A3                      0.45        V      IOL = 24 mA D.C., VCC = Min., CL = 60 pF
VOL3      Low Level Output Voltage for Open                        0.5         V      IOL = 30 mA D.C., VCC = Min., CL = 30 pF
          Drain (RDY#)
II        Input Leakage Current                                    ±10         µA     VCC = Max., GND ≤ VIN ≤ VCC
IOZ       Output Leakage Current                                   ±10         µA     VCC = Max., GND ≤ VOUT ≤ VCC
ISC(5)    Output Short Circuit Current                -30          -90         mA     VCC = Max., VOUT = 0.5V
ICC       Power Supply Current                               10    50          mA     VCC = Max., VIN = VCC or GND, No Load, CLK2 = 50 MHz

NOTES: 

4. Absolute values with respect to device GND; all over and undershoots due to system or tester noise are included. 

5. Not more than 1 output should be tested at a time. Duration of that test should not exceed 1 second. 



A.C. TESTING LOAD CIRCUIT (RDY#) 



[Load circuit diagram: OUTPUT node with diodes D1 and D2.]


See D.C. Characteristics Table for Current and Capaci- 
tance Specifications. 
D1 and D2 are matched. 



A.C. TESTING LOAD CIRCUIT 
(ALL OUTPUTS EXCEPT RDY#) 




[Load circuit diagram. Drawing 290192-18.]

See D.C. Characteristics Table for Current and Capaci- 
tance Specifications. 
D1 and D2 are matched 
D3 and D4 are matched 
A.C. TESTING WAVEFORM - SYNCHRONOUS INPUTS AND OUTPUTS

[Waveform diagram: CLK2, INPUT (SETUP AND HOLD), and OUTPUTS traces with marked test points. Drawing 290192-10.]

A.C. Testing: Inputs are driven at 2.4V for a Logic "1" and 0.4V for a Logic "0". CLK2 is driven at 3.0V for a Logic "1" 
and 0.45V for a Logic "0". Timing Measurements made relative to CLK2 are made from 1.5V on CLK2. Inputs and 
outputs are measured at 2.0V for a high and 0.8V for a low. Device input rise and fall times are less than 3 ns. 



A.C. TESTING WAVEFORM— ASYNCHRONOUS INPUTS AND OUTPUTS 



[Waveform diagram: input and OUTPUTS traces with marked test points. Drawing 290192-11.]

A.C. Testing: Inputs are driven at 2.4V for a Logic "1" and 0.4V for a Logic "0". Input timing is measured at 1.5V for
high-to-low and low-to-high transitions. Outputs are measured at 2.0V for a high and 0.8V for a low. Device input rise and
fall times are less than 3 ns.
A.C. CHARACTERISTICS (TA = 0°C to +70°C, VCC = 5.0V ±5%)

                                                         85C960-25     85C960-20
Symbol    Parameter                                      Min    Max    Min    Max    Units
t1(6)     Input Setup to CLK2a                           12            15            ns
t2(6)     Input Hold from CLK2a                          2             2             ns
t3        CLK2a to A2, A3 Valid Delay                           8             10     ns
t4        CLK2c to RDY# Output Low Delay                        10            15     ns
t5(7)     CLK2c to RDY# Output High Delay                       10            15     ns
t6        CLK2a to CS0#-CS3# High Delay                  5      40     5      50     ns
t7        CLK2a to BLAST# Low Delay                             20            20     ns
t8        CLK2a to BLAST# High Delay                     5             5             ns
t9(8)     CLK2b to WCLK# Low Delay                              10            12     ns
t10(8)    CLK2d to WCLK# High Delay                             10            12     ns
t11(9)    ADS# Low to CS0#-CS3# Low Delay                       10            12     ns
t12(9)    CLK2c to CS0#-CS3# Low Delay                          12            15     ns
t13(10)   I0-I7 Setup to CLK2a                           5             7             ns
t14(10)   I0-I7 Hold from CLK2a                          2             2             ns
t15(11)   I0-I7 Valid to CS0#-CS3# Valid Delay (tPD)            10            12     ns
t16       RDY# Input Setup to CLK2d (Write)              7.5           10            ns
t17       RDY# Input Setup to CLK2a (Read)               9             9             ns
t18       RDY# Input Hold after CLK2a (Read/Write)       5             10            ns
t19(12)   RESET High Setup to CLK2↑                                                  ns
t20(13)   RESET High Hold from CLK2↑                     3             3             ns
t21(12)   RESET Low Setup to CLK2a                       5             5             ns


NOTES: 

6. Applies to ADS#, DEN#, W/R#, and AD0-AD3. DEN# is high during the entire Ta state in 80960 KA/KB systems. 

7. RDY# is an open-drain output. Specified time includes RDY# output float delay and pull-up/pull-down resistors
(Figure 8). RDY# remains low for a minimum of 10 ns at the start of a Tr state and goes high by CLK2a of the next Tx state.

8. Minimum WCLK# pulse width is one clock period minus 3 ns. For example, at 25 MHz: 20 ns - 3 ns = a 17 ns minimum
WCLK# pulse.

9. Chip Select Decoder latches are transparent flow-through types. Latches open when ADS# is low, DEN# is high, and 
CLK2 goes high during the middle of a Tx state (CLK2c). Since DEN# is high during the entire Ta state in 80960 KA/KB 
systems, only CLK2c and ADS# are specified. 

10. Chip Select Decoder latches are transparent flow-through types. Latches close when ADS# is high or DEN# is low, or 
when CLK2 goes high at the start of a Tx state (CLK2a) after the latches have opened. Since ADS# is low and DEN# is 
high at the end of a Ta in 80960 KA/KB systems, setup and hold times are specified with reference to CLK2a only. 

11. Propagation delay while latches are open (transparent); one output switching (high-to-low). 

12. RESET must be held high for a minimum of 4 CLK2 cycles (80960 specifies 41 CLK2 cycles minimum). 

13. RESET must hold after the low-to-high transition immediately prior to CLK2a. CLK2a is defined as the first low-to-high 
transition after RESET goes low. 
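
The WCLK# pulse-width rule in note 8 can be evaluated for either speed grade with a small helper. This sketch assumes that "one clock period" means one CLK2 period and that CLK2 runs at twice the processor frequency, which is consistent with the 20 ns figure quoted for 25 MHz; the function name is hypothetical.

```python
def min_wclk_pulse_ns(processor_mhz):
    """Minimum WCLK# pulse width: one CLK2 period minus 3 ns (note 8)."""
    clk2_mhz = 2 * processor_mhz          # assumption: CLK2 = 2 x processor clock
    clk2_period_ns = 1000.0 / clk2_mhz
    return clk2_period_ns - 3.0


print(min_wclk_pulse_ns(25))  # 25 MHz grade
print(min_wclk_pulse_ns(20))  # 20 MHz grade
```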
CLK2 EDGES

[Diagram: CLK2 waveform identifying the CLK2 edges referenced in the timing tables.]

NOTE: 

Minimum CLK2 high and low times are 8 ns measured from 1.5V to 1.5V. 
CAPACITANCE (TA = 0°C to +70°C; VCC = 5.0V ±5%)

Symbol   Parameter             Min   Typ   Max   Unit   Conditions
CIN      Input Capacitance           6     10    pF     VIN = 0V, f = 1.0 MHz
COUT     Output Capacitance          6     10    pF     VOUT = 0V, f = 1.0 MHz
CCLK     CLK2 Capacitance            6     10    pF     VIN = 0V, f = 1.0 MHz
CVPP     VPP Pin Capacitance         10    25    pF     VPP on Pin 1 (RESET)
CRDY     RDY# Capacitance            6     10    pF     VOUT = 0V, f = 1.0 MHz
WCLK# TIMING

I0-I7 AND CS0#-CS3# TIMING

[Timing diagrams: CLK2, ADS#, DEN#, AD0-AD3, and CS0#-CS3# across Tw/Td states, annotated with timing parameter numbers from the A.C. Characteristics table.]


NOTE: 

CLK2, ADS#, and DEN# generate internal latch enable. See Figure 7 for details. 
[Timing diagram: CLK2, ADS#, DEN#, AD0-AD3, I0-I7, W/R#, CS0#-CS3#, A2, A3, WCLK#, RDY#, and BLAST# through Td, Td, Td, and Tr states.]

3 Word Burst with Wait States on Each Access
RDY# is Generated Externally
(WCLK# is Only Generated During Burst Write Transactions)


RESET INPUT TIMING 



[Timing diagram: CLK2 and RESET, with RESET held for 4 CLK2 cycles (minimum).]
27960CX

PIPELINED BURST ACCESS 1M (128K x 8) CHMOS EPROM

■ Synchronous 4 Byte Data Burst Access
■ No Glue Interface to 80960CA
■ High Performance Clock to Data Out
  - Zero Wait State Data to Data Burst
  - Up to 33 MHz 80960CA Performance
■ Asynch Microcontroller Reset Function
  - Returns to Known State with High-Z Outputs
■ Pipelined Addressing for Optimal Bus Bandwidth on 80960CA
  - Next Addressing Overlaps Last Data Byte
■ CHMOS III-E for High Performance and Low Power
  - 125 mA Active, 30 mA Standby
  - TTL Compatible Inputs
■ 1 Mbit Density Configures as 128K x 8



Intel's 27960CX is a 5V only, 1,048,576 bit, Erasable Programmable Read Only Memory, organized as 128K 
words of 8 bits. 

The 27960CX provides a no glue synchronous burst interface to the 80960CA bus. Internally the 27960CX is 
organized in 4 byte blocks, in which each byte is accessed sequentially. The internal state machine is factory 
configured to generate either 1 or 2 wait-states between the address and first data byte. High performance 
outputs provide zero wait-state data to data accesses at clock frequencies up to 33 MHz. 

Pipelining capability allows addresses to overlap previous data, further optimizing bus bandwidth in 80960CA
applications. An asynchronous microcontroller RESET feature puts the outputs in the high impedance state 
and takes the internal state machine to a known state where a new burst access can begin. 

The 27960CX is available in 44-lead PLCC package, providing optimum cost effectiveness. 

The 27960CX is manufactured on Intel's 1 micron CHMOS III-E technology. The Quick-Pulse Programming™
algorithm provides fast, reliable programming with throughput under 17 seconds for optimized equipment.

*CHMOS is a Patented Process of Intel Corporation. 
Figure 1. 27960CX Burst EPROM Block Diagram
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27960CX BURST EPROM 

EPROMs are established as the preferred code stor- 
age device in embedded applications. The non-vola- 
tile, flexible, reliable, cost effective EPROM makes a 
product easier to design, manufacture and service. 
Until recently, however, EPROMs could not match 
the performance needs of high-end systems. The 
27960CX was designed to support the 80960CA em- 
bedded processor. It utilizes the burst interface to 
offer near zero wait-state performance without the 
high cost normally associated with this performance. 

In embedded designs, board space and cost must 
be kept at a minimum without impacting perform- 
ance and reliability. The 27960CX removes the need 
for expensive high-speed shadow RAM backed up 
by slow EPROM or ROM for non-volatile code stor- 
age. Code optimization concerns are reduced with 
"off-chip" code fetches no longer crippling to sys- 
tem performance. FONTs can be run directly out of 
these EPROMs at the same performance as high- 
speed DRAMs. With the 27960CX, the EPROM is 
the ideal code or FONT storage device for your 
80960CA system. 



"CERQUAD is available in a socket only version. 



Architecture 

The 27960CX provides a no-glue, synchronous burst 
interface to the 80960CA's bus. It operates in pipe-
lined or non-pipelined modes. Internally, the
27960CX is organized in 4 byte blocks which are
accessed sequentially. A burst access begins on the
first clock pulse after ADS and CS are asserted. The
address of the 4 byte block is latched on the rising
edge of clock following ADS. After a preset number
of wait-states (1 or 2), data is output one byte at a
time on each subsequent clock cycle. A burst ac-
cess is terminated on the rising edge of clock with
BLAST asserted. High performance outputs provide
zero wait-state data to data accesses at clock fre- 
quencies up to 33 MHz. Extra power and ground 
pins dedicated to the outputs reduce the effects of 
fast output switching on device performance. 

The pipelining capability of the 27960CX allows the 
address to overlap the last data byte of the burst, 
further optimizing bus bandwidth in 80960CA appli-
cations. In the pipelined mode, with a non-buffered
interface, the 27960CX delivers 4 bytes of data in
6 clock cycles at 33 MHz. In a 32-bit configuration,
this translates into a read bandwidth of 88 Mbytes/
sec. Performance capability of the 27960CX in dif-
ferent 80960CA systems is given in Table 1.
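
The bandwidth figure follows directly from the burst length and clock rate. The sketch below reproduces the arithmetic; the function name and parameter defaults are assumptions chosen to match the 32-bit, four-word, six-clock case described above.

```python
def burst_bandwidth_mbytes(clock_mhz, words=4, bytes_per_word=4, clocks=6):
    """Read bandwidth in Mbytes/sec for a pipelined burst.

    In a 32-bit configuration each burst moves `words` 4-byte words
    (one byte from each of four EPROMs per word) in `clocks` clock cycles.
    """
    return words * bytes_per_word * clock_mhz / clocks


print(burst_bandwidth_mbytes(33))  # 33 MHz, 2 wait states, non-buffered
print(burst_bandwidth_mbytes(25))  # 25 MHz, 2 wait states, buffered
```

The 25 MHz result is about 66.7 Mbytes/sec, which the data sheet quotes as 66 Mbytes/sec.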
Figure 2. 27960CX Burst EPROM Signal Set

Table 1. Performance Capability

33 MHz 2 WS Non-Buffered: 4 Words/6 Clock Cycles -> 88 Mbytes/Sec

[Timing diagram: PCLK cycles with address (A1, A2), wait states (WS), and data (D00-D03, D10-D12).]

25 MHz 2 WS Buffered: 4 Words/6 Clock Cycles -> 66 Mbytes/Sec
Figure 3. 27960CX 44 Lead PLCC Pinout

PIN DESCRIPTIONS

Symbol   Pin                       Function
A0-A16   23-39                     ADDRESS INPUTS: During a burst operation, A2-A16 provide the
                                   base address pointing to a block of four consecutive bytes. A0 and A1
                                   select the first byte of the burst access. The 27960CX latches
                                   addresses in the first clock cycle. An internal address generator
                                   increments addresses A0 and A1 for subsequent bytes of the burst.
D0-D7    18,17,14,13,11,10,7,6     DATA INPUTS/OUTPUTS
ADS      42                        ADDRESS STROBE: Indicates the start of a new bus access. ADS is
                                   active low in the first clock cycle of a bus access.
CS       3                         CHIP SELECT: Master device enable. When asserted (active low)
                                   data can be written to and read from the device. In read mode, CS
                                   enables the state machine and the I/O circuitry.
                                   NOTE:
                                   1. The address decode path is independent of CS, i.e., X and Y
                                   decoding is always powered up.
                                   2. For programming, CS should remain low for the entire cycle.
                                   Program and verify functions are done one byte at a time.
                                   3. CS going high does not terminate a concurrent burst cycle.
BLAST    1                         BURST LAST: Terminates a concurrent burst data cycle at the rising
                                   edge of the CLK. It must be asserted by the fourth data byte.
RESET    22                        RESET: Resets the state machine into a known state, tri-states the
                                   outputs. RESET must be asserted for a minimum of 1 clock cycles. At
                                   least 5 clock cycles are required after deassertion of RESET before
                                   beginning the next cycle. RESET will abort a concurrent bus cycle.
PGM      43                        PROGRAM-PULSE CONTROL INPUT
VPP      2                         PROGRAMMING POWER SUPPLY
VSS      5,8,12,15,19,21           GROUND
VCC      9,16,20,44                SUPPLY VOLTAGE INPUT
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INTERFACE EXAMPLE 



Overview 

This example illustrates 8-, 16- and 32-bit wide 
27960CX interfaces to the 80960CA. The designs 
offer a simple "no-glue" interface. 

A non-buffered 27960CX system organized as 256K
x 32 is shown in Figure 4A. Since the 27960CX is
capable of driving an 80 pF load, large, non-buffered
systems can be implemented by stacking up to 2
banks of 4 EPROMs, resulting in a 256K x 32 memo-
ry subsystem. The input capacitive load seen
on the address lines (due to the EPROM only) is
24 pF for a 128K x 32 system and 48 pF for a 256K x
32 system. The EPROM is specified at 6 pF for input
capacitance (15 pF max) and 12 pF typical for out-
put capacitance. Larger systems can be implement-
ed with buffers (Figure 4B).
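
The address-line loading figures quoted above are the per-device input capacitance multiplied by the number of EPROMs sharing each address line. A minimal sketch of that arithmetic follows; the function name is hypothetical and board trace capacitance is deliberately left out, as in the text.

```python
def address_line_load_pf(banks, eproms_per_bank=4, cin_pf=6):
    """EPROM-only capacitive load on one address line, in pF."""
    return banks * eproms_per_bank * cin_pf


print(address_line_load_pf(1))  # 128K x 32: one bank of four EPROMs
print(address_line_load_pf(2))  # 256K x 32: two banks of four EPROMs
```

Substituting the 15 pF maximum for `cin_pf` gives a worst-case estimate for the same configurations.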

Chip Select Logic 

High order address lines are decoded to provide CS.
Qualification with other signals is not required. The
chip select logic can be implemented with standard
asynchronous decoders, PALs or PLDs (like Intel's
85C508).
Figure 4A. 256K x 32 Non-Buffered Burst EPROM Memory System

Figure 4B. Buffered Burst EPROM Memory System

Schematics

Figure 5 shows a non-buffered, 128K x 32 27960CX 
EPROM system. 

Chip select logic, the only external logic that is re- 
quired for this interface, can be derived from the 
global system chip select circuitry. 



In a non-buffered, 16-bit system (Figure 6A) BE1
and A2 connect to the lower order address bits of
the 27960CX. BE1 connects to A0 of both EPROMs,
while A2 connects to both A1's.

In a non-buffered, 8-bit system (Figure 6B) BE0 and
BE1 connect to A0 and A1 respectively.
Figure 5. 128K x 32 27960CX Burst EPROM System

Figure 6A. 27960CX Burst EPROM in a 16-Bit System

Figure 6B. 27960CX Burst EPROM in an 8-Bit System



Waveforms 



Figure 7 shows the timing waveforms of a 27960CX 
pipelined read in a 32-bit system. 

CS Setup Time 

CS setup time is the time between CS being assert-
ed and the first CLK rising edge (during the address
cycle). Since a memory access begins on the first
CLK rising edge after ADS and CS are asserted, a
minimum CS setup time of 7 ns (tSVCH) at 33 MHz is
required. With the 80960CA's maximum valid ad-
dress delay of 14 ns at 33 MHz, 9 ns remains for CS
decoding logic.



Bootup 

The wait state configuration (1 or 2) of the 27960CX
is programmed by the user into the 80960CA Region
Table parameters of NRAD, NRDD, and NXDA.
NRDD is always 0 for the 27960CX.
NOTES:

1. The EPROM can also operate in non-pipelined mode, i.e., next address and ADS can be asserted in the clock cycle
following the last data word of the burst.

2. 2-0-0-0 Burst Read: 2 indicates the number of wait states to access the first word;
the 0's indicate the number of wait states for subsequent data words
(0 in this case).



Figure 7. Two Cycles of a 27960CX 2 Wait State 4 Byte Read (2-0-0-0 Burst Read) in a 32 Bit System 



During boot-up (Figure 8), the 80960CA picks up its
Region Table data from addresses FFFF FF00,
FFFF FF04, FFFF FF08 and FFFF FF0C. Only the
least significant byte of each of the above four 32-bit
accesses is used to configure the Region Table. For
boot-up, the wait-state parameters NRAD and NXDA
default to 31 and 3 respectively. During boot-up, the
27960CX will wrap around the first word of the four-
word burst and hold the first word until BLAST is
asserted.

27960CX DEVICE NAMES 

The device names on the 27960CX were derived as 
mnemonics that correspond to the number of wait 
states and expected operating frequency for the de- 
vice. For example, the 25 MHz, 2 wait state 
27960CX is named 27960C2-25. 



AC TIMING DERIVATIONS 

The AC timings for the 27960CX were generated
specifically to meet the requirements of the
80960CA microprocessor. In each case the applica-
ble 80960CA clock frequency and AC timing were
taken together with an address buffer delay (if need-
ed) and a typical 2 ns guardband to generate the
27960CX AC timing. Worst case timings were
always assumed. On timings where the EPROM is
faster than the microprocessor, we specified the
time required by the EPROM and left the excess
time as additional system guardband. The example
below shows how the 27960C2-33 tAVCH timing
was derived.

@33 MHz the clock cycle is ~30 ns.
tOV2 of the 80960CA is 3 ns to 14 ns.
Typical 2 ns guardband.

27960C2-33 tAVCH = 30 ns - 14 ns - 2 ns
                 = 14 ns



Decoders are needed for the system's chip select
decoding. For the 27960CX timings we assumed a
10 ns chip select decoder for 16 MHz and a 7 ns
decoder for 25 MHz and 33 MHz systems. The
example below shows how the 27960C2-33 tSVCH
timing was derived.

@33 MHz the clock cycle is ~30 ns.
tOV2 of the 80960CA is 3 ns to 14 ns.
Decoder = 7 ns

27960C2-33 tSVCH = 30 ns - 14 ns - 7 ns
                 = 9 ns
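The two derivations above follow the same pattern: start from the clock period and subtract each worst-case delay in the path. The following is a minimal sketch of that arithmetic only; the function name `setup_budget_ns` is hypothetical and not part of any Intel tool.

```python
def setup_budget_ns(clock_period_ns, processor_delay_ns, *other_delays_ns):
    """Return the time left before the sampling clock edge.

    clock_period_ns    -- nominal clock cycle (about 30 ns at 33 MHz)
    processor_delay_ns -- worst-case output valid delay of the processor
    other_delays_ns    -- any further delays or guardbands in the path
    """
    return clock_period_ns - processor_delay_ns - sum(other_delays_ns)


# 27960C2-33 tAVCH: 30 ns cycle, 14 ns worst-case tOV2, 2 ns guardband
t_avch = setup_budget_ns(30, 14, 2)

# 27960C2-33 tSVCH: 30 ns cycle, 14 ns worst-case tOV2, 7 ns decoder
t_svch = setup_budget_ns(30, 14, 7)

print(f"tAVCH = {t_avch} ns, tSVCH = {t_svch} ns")
```

The worst-case (largest) processor delay is used in both calls, matching the statement that worst case timings were always assumed.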
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Figure 9. 27960CX Burn in Biasing Diagram 



System Buffering Considerations 

For large system applications, buffering may be
required between the microprocessor and memory
devices. The 25 and 16 MHz 27960CX AC timings take
this into account. For applications not requiring
buffering, these devices will provide additional system
guardband.

The list below shows the buffers used in generating
the 27960CX timings:

          Input     Output
          Buffer    Buffer
25 MHz    8 ns      5 ns
16 MHz    10 ns     7 ns

Note that the 25 MHz buffers are slightly faster in
keeping with the increased sensitivity for higher
performance. Significantly faster buffers are available
for applications requiring them. The example below
shows the tCHQV timing analysis for a buffered
27960C2-25.

@25 MHz the clock cycle is ~40 ns.
tIH1 of the 80960CA is 5 ns.
Output buffer for 25 MHz = 5 ns

27960C2-25 tCHQV = 40 ns - 5 ns - 5 ns
                 = 30 ns
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ABSOLUTE MAXIMUM RATINGS" 

Read Operating Temperature ............. 0°C to +70°C(8)
Case Temperature Under Bias ............ -10°C to +80°C(8)
Storage Temperature .................... -65°C to +125°C
All Input or Output Voltages
  with Respect to Ground ............... -0.6V to +6.5V(4)
Voltage on A9
  with Respect to Ground ............... -0.6V to +13.0V(4)
VPP Supply Voltage
  with Respect to Ground ............... -0.6V to +14.0V(4)
VCC Supply Voltage
  with Respect to Ground ............... -0.6V to +7.0V(4)



NOTICE: This data sheet contains preliminary infor- 
mation on new products in production. The specifica- 
tions are subject to change without notice. Verify with 
your local Intel Sales office that you have the latest 
data sheet before finalizing a design. 



* WARNING: Stressing the device beyond the "Absolute 
Maximum Ratings" may cause permanent damage. 
These are stress ratings only. Operation beyond the 
"Operating Conditions" is not recommended and ex- 
tended exposure beyond the "Operating Conditions" 
may affect device reliability. 



READ OPERATION 



DC CHARACTERISTICS 0°C < TA < +70°C, VCC = 5V ±10%, TTL Inputs

Symbol  Parameter                 Notes  Min        Max      Unit  Test Condition
ILI     Input Load Current                          1        µA    VIN = 5.5V
ILO     Output Leakage Current                      10       µA    VOUT = 5.5V
IPP     VPP Load Current Read                       10       µA    VPP = 0 to VCC, PGM = VIH
ISB     VCC Standby, Switching    2                 45       mA    CS = VIH, f = 33 MHz
        VCC Standby, Stable       2                 30       mA    CS = VIH
ICC     VCC Active Current        1,3,7             125      mA    CS = VIL, f = 33 MHz,
                                                                   IOUT = 0 mA
VIL     Input Low Voltage         4      -0.5       0.8      V
VIH     Input High Voltage               2.0        VCC + 1  V
VOL     Output Low Voltage                          0.45     V     IOL = 2.1 mA
VOH     Output High Voltage       5      VCC - 0.8           V     IOH = -100 µA
                                  5      2.4                 V     IOH = -400 µA
IOS     Output Short Circuit      6                 100      mA

NOTES:

1. Maximum current is with outputs unloaded.

2. ICC standby current assumes no output loading, i.e., IOH = IOL = 0 mA.

3. ICC is the sum of current through VCC3 + VCC4 and does not include the current through VCC1 and VCC2. (VCC1 and
VCC2 supply power to the output drivers. VCC3 and VCC4 supply power to the rest of the device.)

4. Minimum DC input voltage on input and output pins is -0.5V. During transitions, this level may undershoot to -2.0V for
periods less than 20 ns.

5. Maximum DC voltage on input and output pins is VCC + 0.5V which may overshoot to VCC + 2.0V for periods less than
20 ns.

6. One output shorted for no more than one second. IOS is sampled but not 100% tested.

7. ICC max measured with a 0.1 µF capacitor between VCC and VSS.

8. This specification defines commercial product operating temperatures.
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EXPLANATION OF AC SYMBOLS 

The nomenclature used for timing parameters is as
per IEEE STD 662-1980, IEEE Standard Terminology
for Semiconductor Memory.

Each timing symbol has five characters. The first is
always a "t" (for time). The second character
represents a signal name (e.g., CLK, ADS, etc.). The third
character represents the signal's level (high or low)
for the signal indicated by the second character. The
fourth character represents a signal name at which a
transition occurs marking the end of the time interval
being specified. The fifth character represents the
signal level indicated for the fourth character. The list
below shows character representations.

A: Address                      R: Reset
B: BLAST                        Q: Data
C: Clock                        S: CS
H: Logic High Level             t: Time
L: ADS/Logic Low Level          V: Valid
P: VPP Programming Voltage      Z: Tri-state Level
X: No longer a valid "driven" logic level
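As an illustration of the five-character convention, the sketch below expands a symbol such as tAVCH into words. The lookup tables are a simplified reading of the list above and are assumptions: "L" is treated as ADS when it appears in a signal position and as Low in a level position, and symbols with extra subscripts (for example the wait-state subscript N) are not handled.

```python
# Signal letters (2nd and 4th characters) and level letters (3rd and 5th).
SIGNALS = {"A": "Address", "B": "BLAST", "C": "Clock",
           "L": "ADS", "Q": "Data", "R": "Reset", "S": "CS"}
LEVELS = {"H": "High", "L": "Low", "V": "Valid",
          "X": "Invalid", "Z": "High Z"}


def describe_symbol(symbol):
    """Expand a five-character timing symbol into a readable phrase."""
    if len(symbol) != 5 or symbol[0] != "t":
        raise ValueError(f"not a five-character timing symbol: {symbol!r}")
    _, from_signal, from_level, to_signal, to_level = symbol
    return (f"{SIGNALS[from_signal]} {LEVELS[from_level]} to "
            f"{SIGNALS[to_signal]} {LEVELS[to_level]}")


for name in ("tAVCH", "tLLCH", "tCHQZ"):
    print(name, "=", describe_symbol(name))
```

The three sample symbols expand to "Address Valid to Clock High", "ADS Low to Clock High" and "Clock High to Data High Z", which agree with the parameter names used in the AC characteristics table.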



AC CHARACTERISTICS: READ OPERATION
0°C < TA < +70°C, VCC = 5V ±10%

Versions: 27960C2-33 (33 MHz, 2 Wait State), 27960C2-25 (25 MHz, 2 Wait State),
27960C1-16 (16 MHz, 1 Wait State)

                                                27960C2-33  27960C2-25  27960C1-16
No.  Symbol  Parameter                   Notes  Min   Max   Min   Max   Min   Max   Unit
1    tAVCH   Address Valid to CLK High          12          10          22          ns
2    tCNHAX  CLK High to Address Invalid 2                                          ns
3    tLLCH   ADS Low to CLK High                8           8           22          ns
4    tCHLH   CLK High to ADS High        5      6     22    6     32    6     40    ns
5    tSVCH   CS Valid to CLK High        1      7           7           14          ns
6    tCNHSX  CLK High to CS Invalid      2                                          ns
7    tCHQV   CLK High to Data Valid      7            27          30          40    ns
8    tCHQX   CLK High to Data Invalid           5           5           5           ns
9    tCHQZ   CLK High to Data High Z     6            25          30          30    ns
10   tBVCH   BLAST Valid to CLK High            8           8           22          ns
11   tCHBX   CLK High to BLAST Invalid   3      5     22    5     32    5     40    ns


NOTES:

1. Valid signal level is meant to be either a logic high or logic low.

2. The subscript N represents the number of wait states for this parameter. CS can be de-asserted (high) after the number
of wait states (N) has expired and the EPROM will continue to burst out data for the current cycle.

3. BLAST must be returned high before the next rising clock edge.

4. The sum of tCHQV + tAVCH + N CLK will not equal actual tAVQV if independent test conditions are used to obtain tAVCH
and tCHQV (N = number of wait states).

5. ADS must be returned high before the next rising clock edge.

6. Sampled, not 100% tested. The transition is measured ±500 mV from steady state voltage.

7. For capacitive loads above 80 pF, tCHQV can be derated by 1 ns/20 pF.
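Note 7 can be applied as a simple linear adjustment. The sketch below assumes the derating is added to the specified maximum tCHQV and scales linearly with the load above 80 pF; the function name and parameters are hypothetical.

```python
def derated_tchqv_ns(tchqv_max_ns, load_pf, rated_load_pf=80.0, pf_per_ns=20.0):
    """Estimate tCHQV for a capacitive load heavier than the rated load.

    Adds 1 ns for every 20 pF above the 80 pF test load (Note 7).
    Loads at or below the rated load leave the specification unchanged.
    """
    extra_pf = max(0.0, load_pf - rated_load_pf)
    return tchqv_max_ns + extra_pf / pf_per_ns


# 27960C2-25 (tCHQV max = 30 ns) driving a 120 pF load:
print(derated_tchqv_ns(30, 120))
```

For a 120 pF load the extra 40 pF adds 2 ns to the 30 ns specification.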
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Figure 10. 27960CX Pipelined 2 Wait State AC Waveforms 
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AC CONDITIONS OF TEST 

Input Rise and Fall Times
(10% to 90%) ........................... 4 ns
Input Pulse Levels ..................... 0.45V to 2.4V
Input Timing Reference Level ........... 1.5V
Output Timing Reference Level .......... 1.5V

Table 2. Mode Table



Mode                      CS   PGM  BLAST   ADS     RESET  A9      VPP     VCC  OUTPUT
Read                      VIL  VIH  VIH(1)  VIH(2)  VIH    X       VCC     VCC  DOUT
Standby(6)                VIH  X    X       X       VIH    X       VCC(5)  VCC  High Z
Program                   VIL  VIL  VIH     VIH(2)  VIH    X       (3)     (3)  DIN
Program Verify            VIL  VIH  VIH(1)  VIH     VIH    X       (3)     (3)  DOUT
Program Inhibit           VIH  X    X       X       VIH    X       (3)     (3)  High Z
ID Byte 0: Manufacturer   VIL  VIH  VIH(1)  VIH(2)  VIH    VID(3)  VCC     VCC  89H
ID Byte 1: Part (27960)   VIL  VIH  VIH(1)  VIH(2)  VIH    VID(3)  VCC     VCC  E0H
ID Byte 2: CX             VIL  VIH  VIH(1)  VIH(2)  VIH    VID(3)  VCC     VCC  01B
ID Byte 3: 1 Wait State   VIL  VIH  VIH(1)  VIH(2)  VIH    VID(3)  VCC     VCC  01B
           2 Wait States                                                        10B
Reset                     X    X    X       X       VIL    X       VCC     VCC  High Z



NOTES:

1. VIH until data terminated, at which time BLAST must go to VIL.

2. Need to toggle from VIH to VIL to VIH.

3. See DC Programming Characteristics for VCC, VID and VPP voltages.

4. X can be VIL or VIH.

5. VPP = VCC to meet standby current specification. VCC > VPP > VIL will cause a slight increase in standby current.

6. The device must be in the idle state (by asserting RESET or using BLAST) before going into standby.



CAPACITANCE(1) TA = 25°C, f = 1.0 MHz

Symbol  Parameter           Typ  Max  Unit  Condition
CIN     Input Capacitance   4    6    pF    VIN = 0V
COUT    Output Capacitance  12   15   pF    VOUT = 0V
CVPP    VPP Capacitance     40   45   pF    VIN = 0V

NOTE:

1. Sampled. Not 100% tested.
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AC INPUT/OUTPUT REFERENCE WAVEFORMS 



[Figure 290236-14: AC input/output reference waveforms. Labels: INPUT (VOH, VOL), TIMING PARAMETER, 1.5V, OUTPUT (VOH).]

Input and output timings are measured from 1.5V.
Timing values are specified assuming maximum input
and output rise and fall time = 4 ns.

AC TESTING LOAD CIRCUIT

[Figure 290236-15: AC testing load circuit. Labels: 2.1V, 780Ω, DEVICE UNDER TEST, CL = 80 pF.]

CL includes jig capacitance.

For tCHQZ, CL = 5 pF and RL = 405Ω.

CLOCK CHARACTERISTICS 



                      33 MHz          25 MHz          20 MHz          16 MHz
Symbol  Parameter     Min      Max    Min      Max    Min      Max    Min      Max    Units
CLK     Period        30.3            40              50              62.5            ns
tPR     Rise Time     1        4      1        4      1        4      1        4      ns
tPF     Fall Time     1        4      1        4      1        4      1        4      ns
tPL     Low Time      (t/2)-2  t/2    (t/2)-3  t/2    (t/2)-4  t/2    (t/2)-4  t/2    ns
tPH     High Time     (t/2)-2  t/2    (t/2)-3  t/2    (t/2)-4  t/2    (t/2)-4  t/2    ns

Max Rise Time for Programming CLK = 100 ns



CLOCK WAVEFORM 
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Program/Program Verify 

Initially, and after each erasure, all bits of the
EPROM are in the "1's" state. Data is introduced by
selectively programming "0's" into the desired bit
locations. Although only "0's" can be programmed,
both "1's" and "0's" can be present in the data
word. Ultraviolet erasure is the only way to change
"0's" to "1's".

Programming mode is entered when VPP is raised to
12.75V. Program/Verify operation is synchronous
with the clock and can only be initiated following an
idle state. Program and Program Verify take place in
3 clock cycles. In the first clock cycle, addresses
and data are input and programming occurs. Program
Verify follows in the second clock cycle and
the third clock cycle terminates synchronous
Program/Verify operation, returning the state machine
to the idle state with outputs at high impedance.

As in the Read mode, A2-A16 point to a four byte
block in the memory array. During programming, the
internal address increment circuitry is disabled and
the programmer must supply A0 and A1 to point to
an individual byte within the four byte block that is to
be programmed. Only one byte is programmed in
each 3 cycle Program/Verify sequence.
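The address split described above can be sketched as follows, assuming a 17-bit byte address (A0 through A16). The function name is hypothetical and only illustrates which bits select the block and which select the byte.

```python
def split_address(address):
    """Split a 17-bit byte address into (block, byte_select).

    block       -- A2-A16, selects one four byte block
    byte_select -- A0-A1, selects one byte inside that block
    """
    if not 0 <= address < (1 << 17):
        raise ValueError("address must fit in A0-A16 (17 bits)")
    block = address >> 2          # drop A0 and A1
    byte_select = address & 0b11  # keep only A0 and A1
    return block, byte_select


print(split_address(5))        # block 1, byte 1
print(split_address(0x1FFFF))  # last block, last byte
```

During a read burst the device increments the two low bits itself; during programming the programmer has to drive them explicitly for each byte.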



Program Inhibit 

The Program Inhibit mode allows parallel programming
and verification of multiple devices with different
data. With VPP at 12.75V, a Program/Verify
sequence is initiated for any device that receives a
valid ADS pulse and rising clock edge while CS is
asserted. A PGM pulse programs data in the first cycle
of the sequence and data for Program Verify is
output in the second cycle. The Program/Verify
sequence is inhibited on any devices for which CS is
not asserted. Data will not be programmed and the
outputs will remain in their high impedance state.



inteligent Identifier™ Mode

The device's manufacturer, product type, and
configuration are stored in a four byte block that can be
accessed by using the inteligent Identifier™ mode.
The programmer can verify the device identifier and
choose the programming algorithm that corresponds
to the Intel 27960CX. The inteligent Identifier can
also be used to verify that the product is configured
with the desired Read mode options for wait states.

inteligent Identifier mode is entered when A9 (pin 32)
is raised to its high voltage (VID) level. The internal
state machine is then set for inteligent Identifier
Read operation. Reading the identifier is similar to a
Read operation on a one wait state configured
product. Up to four bytes can be read in a single burst
access. inteligent Identifier read is terminated by a
synchronous BLAST input, returning the state
machine to the idle state with outputs at high
impedance.

The four byte block for the inteligent Identifier
code is located at address 00H through 03H and is
encoded as follows:

MEANING          (A1,A0)   DATA
Intel ID         Byte 00   89h
27960            Byte 01   E0h
CX               Byte 10   01b
1 Wait State     Byte 11   01b
2 Wait States    Byte 11   10b
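A programmer could check the four identifier bytes along the following lines. This is a sketch only: it assumes the "01b" and "10b" entries are binary codes carried in the two low-order bits of the byte, and the function name and returned dictionary are hypothetical.

```python
INTEL_ID = 0x89      # Byte 00: manufacturer code
PART_27960 = 0xE0    # Byte 01: part code
WAIT_STATES = {0b01: 1, 0b10: 2}  # Byte 11 (assumed binary encoding)


def decode_identifier(id_bytes):
    """Decode the four identifier bytes read at addresses 00H-03H."""
    manufacturer, part, family_code, wait_code = id_bytes
    if manufacturer != INTEL_ID:
        raise ValueError(f"unexpected manufacturer code {manufacturer:#04x}")
    if part != PART_27960:
        raise ValueError(f"unexpected part code {part:#04x}")
    wait_states = WAIT_STATES.get(wait_code & 0b11)
    if wait_states is None:
        raise ValueError(f"unexpected wait-state code {wait_code:#04b}")
    return {"manufacturer": "Intel", "part": "27960",
            "family_code": family_code, "wait_states": wait_states}


print(decode_identifier([0x89, 0xE0, 0b01, 0b10]))
```

With the sample bytes shown, the decoded configuration reports a two wait state device.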


RESET MODE 







Due to the synchronous nature of the 27960CX, the
various operating modes must be initiated from a
known idle state. During normal operation, the
internal state machine returns to an idle state at the
termination of a bus access (after BLAST is asserted).

During initial device power up, the state machine is
in an indeterminate state. The reset mode is provided
to force operation into the idle state. Reset mode
is entered when the RESET pin is asserted. Output
pins are asynchronously set to the high impedance
state and address latches are put into the flow
through mode. A reset is successfully completed
and the state machine set in an idle state when
RESET has been asserted for a minimum of 10
clock cycles and deasserted for five clock cycles.
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INCREMENT ADDRESS )< C ADDRESS? ><- 



A V CC = 5.0V A 
^ V pp = 12.75V J 



DEVICE 
FAILED 
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Figure 11. Quick-Pulse ProgrammingTM Algorithm 






QUICK-PULSE PROGRAMMING™
ALGORITHM

The Quick-Pulse Programming algorithm programs
Intel's 27960CX. Developed to substantially reduce
programming throughput time, this algorithm allows
optimized equipment to program a 27960CX in
under 17 seconds. Actual programming time depends
on the programmer used.

The Quick-Pulse Programming algorithm uses a
100 µs pulse followed by a byte verification to
determine when the addressed byte is correctly
programmed. The algorithm terminates if 25 100 µs
pulses fail to program a byte. Figure 11 shows the
27960CX Quick-Pulse Programming algorithm
flowchart.

The entire program-pulse/byte-verify sequence is
performed with VCC = 6.25V and VPP = 12.75V.
The program equipment must establish VCC before
applying voltages to any other pins. When
programming is complete, all bytes should be compared to
the original data with VCC = 5.0V and VPP =
12.75V.
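The pulse-and-verify loop described above can be sketched as follows. The two callbacks, `program_pulse` and `verify_byte`, are hypothetical stand-ins for whatever the programming equipment provides; supply sequencing and the final compare at VCC = 5.0V are not modeled.

```python
MAX_PULSES = 25  # the algorithm gives up after 25 pulses on one byte


def quick_pulse_program(data, program_pulse, verify_byte):
    """Program each byte with up to MAX_PULSES pulse/verify attempts.

    data          -- sequence of byte values, indexed by address
    program_pulse -- callable(address, value) applying one 100 us pulse
    verify_byte   -- callable(address) returning the byte read back
    Returns True when every byte verifies, False on the first failure.
    """
    for address, value in enumerate(data):
        for _ in range(MAX_PULSES):
            program_pulse(address, value)
            if verify_byte(address) == value:
                break  # byte is programmed, move to the next address
        else:
            return False  # 25 pulses did not program this byte
    return True
```

Returning at the first failing byte mirrors the "device failed" exit of the flowchart; a caller would then follow a successful run with the full compare against the original data.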



D.C. PROGRAMMING CHARACTERISTICS TA = 25°C ±5°C

Symbol  Parameter                         Notes  Min        Max        Unit  Condition
ILI     Input Load Current                                  10         µA    VIN = VIH or VIL
ICC     VCC Program Current               1                 125        mA    CS = VIL
IPP     VPP Program Current               1                 50         mA    CS = VIL
VIL     Input Low Voltage                        -0.5       0.8        V
VIH     Input High Voltage                       2.0        VCC + 0.5  V
VOL     Output Low Voltage (Verify)                         0.40       V     IOL = 2.1 mA
VOH     Output High Voltage (Verify)             VCC - 0.8             V     IOH = -400 µA
VID     A9 inteligent Identifier Voltage         11.5       12.5       V
VCC     Supply Voltage (Program)          2      6.0        6.5        V
VPP     Program Voltage                   2      12.5       13.0       V





NOTES:

1. The maximum current value is with outputs unloaded.

2. VCC must be applied simultaneously or before VPP and removed simultaneously or after VPP.

3. During programming, clock levels are VIH and VIL.
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A.C. PROGRAMMING, RESET AND ID CHARACTERISTICS T A = 25°C ±5°C 


No.  Symbol  Parameter                        Notes  Min  Max  Unit
1    tAVPL   Address Valid to PGM Low                2         µs
2    tCHAX   CLK High to Address Invalid             50        ns
3    tLLCH   ADS Low to CLK High              1      50        ns
4    tCHLH   CLK High to ADS High             2      50        ns
5    tSVCH   CS Valid to CLK High                    50        ns
6    tCHSX   CLK High to CS Invalid           3                ns
7    tCHQV   CLK High to DOUT Valid                  100       ns
8    tCHQX   CLK High to DOUT Invalid                          ns
9    tBVCH   BLAST Valid to CLK High                 50        ns
10   tCHBX   CLK High to BLAST Invalid        4      50        ns
11   tQVPL   DATA Valid to PGM Low                   2         µs
12   tPLPH   PGM Program Pulse Width                 95   105  µs
13   tPHQX   PGM High to DIN Invalid                 2         µs
14   tCLPL   CLK Low to PGM Low                      50        ns
15   tQZCH   DIN Tri-State to CLK High               2         µs
16   tVCS    VCC Program Voltage to CLK High  7      2         µs
17   tVPS    VPP Program Voltage to CLK High  7      2         µs
18   tA9HCH  A9 VID Voltage to CLK High              2         µs
19   tCHA9X  CLK High to A9 Not VID Voltage          2         µs
20   tRVCH   RESET Valid to CLK High          6      50        ns
21   tCHCL   CLK High to CLK Low              5      100       ns
22   tCLCH   CLK Low to CLK High              5      100       ns

NOTES:

1. If CS is low, ADS can go low no sooner than the falling edge of the previous CLK.

2. ADS must return high prior to the next rising edge of clock.

3. CS must remain low until after the rising edge of CLK1.

4. BLAST must return high prior to the next rising edge of CLK.

5. Max CLK rise/fall time is 100 ns.

6. RESET must be low for 10 clock cycles and high for 5 clock cycles.

7. VCC must be applied simultaneously or before VPP and removed simultaneously or after VPP.






Figure 12. 27960CX Programming Waveforms 
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RESET and int e ligent Identifier Waveforms 
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Figure 13. 27960CX RESET and ID Waveforms 
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27960KX 
BURST ACCESS 1M (128K x 8) CHMOS EPROM 



Synchronous 4-Byte Data Burst Access

Simple Interface to the 80960KA/KB

High Performance Clock to Data Out
  - Zero Wait State Data-to-Data Burst
  - Supports 16, 20 and 25 MHz 80960KA/KB Devices

Asynch Microcontroller Reset Function
  - Returns to Known State with High Z Outputs

CHMOS* III-E for High Performance and Low Power
  - 125 mA Active, 30 mA Standby
  - TTL Compatible Inputs

1 Mbit Density Configures as 128K x 8



Intel's 27960KX is a 5V only, 1,048,576 bit, Erasable Programmable Read Only Memory, organized as 128K 
words of 8 bits. 

The 27960KX provides a simple synchronous burst interface to the 80960KA/KB bus. Internally the 27960KX 
is organized in 4 byte blocks, in which each byte is accessed sequentially. The internal state machine is factory 
configured to generate either 1 or 2 wait-states between the address and first data byte. High performance 
outputs provide zero wait-state data to data accesses at clock frequencies up to 25 MHz. 



An asynchronous microcontroller RESET feature puts the outputs in the high impedance state and takes the 
internal state machine to a known state where a new burst access can begin. 

The 27960KX is available in 44 lead PLCC package, providing optimum cost effectiveness. 

The 27960KX is manufactured on Intel's 1 micron CHMOS III-E technology. The Quick-Pulse Programming™
algorithm provides fast, reliable programming with throughput under 17 seconds for optimized equipment.

*CHMOS is a patented process of Intel Corporation. 
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Figure 1. 27960KX Burst EPROM Block Diagram 
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27960KX BURST EPROM 

EPROMs are established as the preferred code stor- 
age device in embedded applications. The non-vola- 
tile, flexible, reliable, cost effective EPROM makes a 
product easier to design, manufacture and service. 
Until recently, however, EPROMs could not match 
the performance needs of high-end systems. The 
27960KX was designed to support the 80960KA/KB 
embedded processor. It utilizes the burst interface to 
offer near zero-wait state performance without the 
high cost normally associated with this performance. 

In embedded designs, board space and cost must 
be kept at a minimum without impacting perform- 
ance and reliability. The 27960KX removes the need 
for expensive high-speed shadow RAM backed up 
by slow EPROM or ROM for non-volatile code stor- 
age. Code optimization concerns are reduced with 
"off-chip" code fetches no longer crippling to sys- 
tem performance. FONTs can be run directly out of 
these EPROMs at the same performance as high- 
speed DRAMs. With the 27960KX, the EPROM is 
the ideal code or FONT storage device for your 
80960KA/KB system. 



Architecture 

The 27960KX provides a simple, synchronous burst
interface to the 80960KA/KB's bus. Internally, the
27960KX is organized in 4 byte blocks in which each
byte is accessed sequentially. A burst access begins
on the first clock pulse after CS is asserted. The
address of the four byte block is latched by the rising
edge of ALE. After a preset number of wait-states
(1 or 2), data is output one byte at a time on each
subsequent clock cycle. A burst access is terminated on
the rising edge of CLOCK if BLAST is asserted. High
performance outputs provide zero wait-state data to
data accesses at clock frequencies up to 25 MHz.
Extra power and ground pins dedicated to the
outputs reduce the effects of fast output switching on
device performance.

The 27960KX delivers 4 bytes of data in 8 clock
cycles at 25 MHz and 4 bytes of data in 7 clock
cycles at 20 MHz. In a 32-bit configuration, this
translates into a read bandwidth of 50 Mbytes/sec
and 45 Mbytes/sec respectively. Performance
capability of the 27960KX in different 80960KA/KB
systems is given in Table 1.
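The bandwidth figures quoted above can be reproduced with a short calculation. The sketch assumes a 32-bit configuration built from four byte-wide devices, so each burst moves 16 bytes, and takes one Mbyte as 10^6 bytes; the function name is hypothetical.

```python
def burst_bandwidth_mbytes(clock_mhz, cycles_per_burst, bytes_per_burst=16):
    """Read bandwidth in Mbytes/sec for back-to-back bursts.

    clock_mhz        -- bus clock frequency in MHz
    cycles_per_burst -- clock cycles taken by one complete burst
    bytes_per_burst  -- 4 bytes from each of 4 devices in a 32-bit system
    """
    return bytes_per_burst * clock_mhz / cycles_per_burst


print(burst_bandwidth_mbytes(25, 8))            # 2 wait states at 25 MHz
print(round(burst_bandwidth_mbytes(20, 7), 1))  # 1 wait state at 20 MHz
```

The 25 MHz case gives exactly 50 Mbytes/sec; the 20 MHz case gives about 45.7, which the text quotes as 45 Mbytes/sec.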
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Figure 2. 27960KX Burst EPROM Signal Set 
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Figure 3. 27960KX 44-Lead PLCC Pinout 



PIN DESCRIPTIONS 



Symbol: A0-A16
Pin: 23-39
Function: ADDRESS INPUTS: During a burst operation, A2 through A16 provide the base
address pointing to a block of four consecutive bytes. A0 and A1 select the first
byte of the burst access. The 27960KX latches valid addresses in the first clock
cycle. An internal address generator increments addresses A0 and A1 for
subsequent bytes of the burst.

Symbol: D0-D7
Pin: 18, 17, 14, 13, 11, 10, 7, 6
Function: DATA INPUTS/OUTPUTS

Symbol: ALE
Pin: 42
Function: ADDRESS LATCH ENABLE: Indicates the transfer of a physical address. ALE
is an active low signal used to latch the addresses from the processor.
Addresses are latched on the rising edge of ALE. Valid addresses must be
present at or before ALE becomes valid.

Symbol: CS
Pin: 3
Function: CHIP SELECT: Master device enable. When asserted (active low) data can be
written to and read from the device. In read mode, CS enables the state
machine and the I/O circuitry.

NOTES:

1. The address decode path is independent of CS, i.e., X and Y decoding is
always powered up.

2. For programming, CS should remain low for the entire cycle. Program and
verify functions are done one byte at a time.

3. CS going high does not terminate a concurrent burst cycle.

4. CS must be deasserted between bursts.

Symbol: BLAST
Pin: 1
Function: BURST LAST: Terminates a concurrent burst data cycle at the rising edge of the
CLK. It must be asserted by the fourth data byte.

Symbol: RESET
Pin: 22
Function: RESET: Resets the state machine into a known state and tri-states the outputs. The
duration of RESET should be 10 CLK cycles minimum. At least 5 clock cycles
are required after deassertion of RESET before beginning the next cycle. Reset
will abort a concurrent bus cycle.
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PIN DESCRIPTIONS (Continued) 



Symbol  Pin                    Function
PGM     43                     PROGRAM-PULSE CONTROL INPUT
VPP     2                      PROGRAMMING POWER SUPPLY VPP
VSS     5, 8, 12, 15, 19, 21   GROUND
VCC     9, 16, 20, 44          SUPPLY VOLTAGE INPUT



Table 1. Performance Capability 



25/20 MHz 2 WS NON-BUFFERED: 4 WORDS/8 CLOCK CYCLES (50/40 MBYTES/SEC)

CLK    C1    C2   C3   C4    C5    C6    C7    C8
ADDR   A00   WS   WS   -     -     -     -     RS
DATA   -     -    -    D00   D01   D02   D03   -

20 MHz 1 WS NON-BUFFERED: 4 WORDS/7 CLOCK CYCLES (45 MBYTES/SEC)

CLK    C1    C2   C3    C4    C5    C6    C7
ADDR   A00   WS   -     -     -     -     RS
DATA   -     -    D00   D01   D02   D03   -

16 MHz 1 WS BUFFERED: 4 WORDS/7 CLOCK CYCLES (36 MBYTES/SEC)

CLK    C1    C2   C3    C4    C5    C6    C7
ADDR   A00   WS   -     -     -     -     RS
DATA   -     -    D00   D01   D02   D03   -

In each case the next burst repeats the same sequence with address A01
and data D10 through D13.




INTERFACE EXAMPLE 



Overview 

The following design offers a simple interface to the 
80960KA/KB's bus. 

A non-buffered 27960KX burst EPROM system is
shown in Figure 4. Since the 27960KX is capable of
driving a 120 pF load, large, non-buffered systems
can be implemented by stacking up to 2 banks of 4
EPROMs, giving a memory size of 256K x 32. The
input capacitive load seen on the address lines (due
to the EPROM only) is 24 pF for a 128K x 32
system (shown) and 48 pF for a 256K x 32 system.
The EPROM is specified at 4 pF for input
capacitance and 12 pF typical for output capacitance.
Larger systems can be implemented with buffers.

Chip Select Logic 

High order address lines are decoded to provide CS. 
Qualification with other signals is not required. The 
chip select logic can be implemented with standard 
asynchronous decoders, PAL's or PLD's (like Intel's 
85C960). 
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NOTE: 

27960KX does not require address latches 



Figure 4. 128K x 32 Burst EPROM System

Waveforms

Figure 5 shows the timing waveforms of 27960KX
reads in a 32-bit system.

CS Deassert between bursts

After every EPROM read (one to four words) CS
must be deasserted.



CS setup time 

CS setup time is the time between CS asserted and
the first rising edge of CLK (during the address
cycle). Since a memory access begins on the first
CLK rising edge after CS is asserted, a minimum CS
setup time of 5 ns (tSVCH) at 25 MHz is required.
With the 80960KA/KB's maximum valid address
delay of 18 ns at 25 MHz, 13 ns remains for CS
decoding logic.
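One way to arrive at the 13 ns figure is sketched below. It assumes the 4 ns positive clock skew guardband used elsewhere in the 27960KX timing derivations is also subtracted here; that assumption is not stated in this paragraph, and the function name is hypothetical.

```python
def cs_decode_budget_ns(clock_period_ns, address_delay_ns,
                        cs_setup_ns, clock_skew_ns=0.0):
    """Time available for chip select decoding logic.

    clock_period_ns  -- 40 ns at 25 MHz
    address_delay_ns -- maximum valid address delay of the processor
    cs_setup_ns      -- tSVCH required by the EPROM
    clock_skew_ns    -- guardband for skew between CLK2 and CLK (assumed)
    """
    return clock_period_ns - address_delay_ns - cs_setup_ns - clock_skew_ns


print(cs_decode_budget_ns(40, 18, 5, clock_skew_ns=4))
```

Without the skew term the same inputs would leave 17 ns, so the quoted 13 ns is consistent with a 4 ns guardband being included.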



Reset and RESET 



The 27960KX uses RESET. The 80960 KA/KB 
RESET signal must be inverted for the 27960KX. 

Clock Phase 

The initial rising edge of CLK and CLK2 must be in 
phase with as small a skew as possible. 










NOTES: 

1. 1-0-0-0 Burst Read: 1 indicates the number of wait states to access the first word;
0's indicate the number of wait states for subsequent data words (0 in this case).

2. 27960KX latches addresses on the rising edge of ALE; it has an internal address generator which increments
addresses for subsequent words of the burst.



Figure 5. Two Cycles of a 27960KX 1 Wait State, 4-Byte Read (1-0-0-0 Burst Read) in a 32-Bit System 



27960KX DEVICE NAMES 

The device names on the 27960KX were derived as 
mnemonics that correspond to the number of wait 
states and expected operating frequency for the de- 
vice. For example, the 25 MHz, 2 wait state 
27960KX is named 27960K2-25. 



AC TIMING DERIVATIONS 

The AC timings for the 27960KX were generated
specifically to meet the requirements of the
80960KA/KB microprocessor. In each case the
applicable 80960KA/KB clock frequency and AC timing
were taken together with an address buffer delay
(if needed) and a 4 ns positive clock skew or a 2 ns
negative clock skew (see Figure 6A) guardband to
generate the 27960KX AC timing. Worst case timings
were always assumed. The example below
shows how the 27960K1-20 tAVCH timing was
derived.

@20 MHz the clock cycle is ~50 ns.
t6 of the 80960KA/KB is 2 ns to 20 ns.
4 ns clock skew guardband.

27960K1-20 tAVCH = 50 ns - 20 ns - 4 ns
                 = 26 ns

On timings such as this, where the EPROM is faster
than the microprocessor, we specified the EPROM's
timing, leaving the excess time as system guardband.
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NOTE: 

The 27960KX allows a positive clock skew (CLK2 leading CLK) of up to 4 ns and a negative clock skew (CLK2 lagging 
CLK) of up to 2 ns. The larger positive clock skew takes into account longer trace lengths and heavier loading on the 1 x 
clock trace. 

Figure 6A. Definition of Positive and Negative Clock Skew 



[Figure 290237-12. Labels: 50 MHz CLOCK, 80960KB, Combinatorial PAL 16L8-7, Driver 74F244, CLK2, CLK, four 27960KX devices.]

NOTE: 

CLK and CLK2 are generated by the same PAL. This minimizes skew between CLK and CLK2. Both PAL outputs are fed 
to a 74F244 driver. The EPROMs should be as close to the clock driver as possible. 

Figure 6B. Example Clock Circuit with Minimum Skew 
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NOTE: 

This clock generation circuit uses a 100 MHz oscillator. The EPROMs should be as close to the NAND drivers as 
possible. 




Figure 6C. Example Clock Circuit Using a 100 MHz Oscillator 



Decoders are needed for the system's address (chip
select) decoding. For the 27960KX's timings we
assumed a 5-10 ns chip select decoder for 16 MHz
and 20 MHz frequencies and a 5-9 ns decoder for
25 MHz systems. The example below shows how
the 27960K2-25 tSVCH timing was derived.

@25 MHz the clock cycle is ~40 ns.
t6 of the 80960KA/KB is 2 ns to 18 ns.
Decoder = 9 ns
4 ns clock skew guardband

27960K2-25 tSVCH = 40 ns - 18 ns - 9 ns - 4 ns
                 = 9 ns



SYSTEM BUFFERING CONSIDERATIONS 

For many large system applications buffering may 
be required between the microprocessor and memo- 
ry devices. The 20 MHz - 2 WS and 16 MHz 
27960KX AC timings take this into account. For ap- 
plications at these frequencies not requiring buffer- 
ing these devices will provide an additional 5-10 ns 
of system guardband. 



The list below shows the buffers used in generating 
these timings: 

          Input     Output
          Buffer    Buffer
20 MHz    9 ns      5 ns
16 MHz    10 ns     7 ns

The 20 MHz buffers are slightly faster, in keeping
with the increased sensitivity for higher performance.
We chose the above buffers because of their
wide availability. Significantly faster buffers are available
for applications requiring them. The example
below shows tCHQV for the 27960K2-20.

@20 MHz the clock cycle is 50 ns.
t10 of the 80960KA/KB is 3 ns.
Output buffer for 20 MHz = 5 ns.
4 ns clock skew guardband

27960K2-20 tCHQV = 50 ns - 5 ns - 3 ns - 4 ns
                 = 38 ns
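
The timing derivations in this section all follow one pattern: start from the clock period and subtract each system delay plus the clock skew guardband. The Python sketch below restates that arithmetic for the three worked examples. The function name and constant are illustrative only, and the delay figures are the ones quoted in the text.

```python
def eprom_budget_ns(clock_period_ns, *delays_ns):
    """Time left for the EPROM once the listed system delays are subtracted."""
    return clock_period_ns - sum(delays_ns)


CLOCK_SKEW_NS = 4  # clock skew guardband used in every example

# 27960K1-20 tAVCH: 50 ns cycle, 20 ns worst-case processor t6
t_avch_k1_20 = eprom_budget_ns(50, 20, CLOCK_SKEW_NS)

# 27960K2-25 tSVCH: 40 ns cycle, 18 ns worst-case t6, 9 ns chip select decoder
t_svch_k2_25 = eprom_budget_ns(40, 18, 9, CLOCK_SKEW_NS)

# 27960K2-20 tCHQV: 50 ns cycle, 5 ns output buffer, 3 ns processor requirement
t_chqv_k2_20 = eprom_budget_ns(50, 5, 3, CLOCK_SKEW_NS)

print(t_avch_k1_20, t_svch_k2_25, t_chqv_k2_20)
```

A system without buffers or with a faster decoder can be checked the same way by dropping or changing the corresponding delay argument.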
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ABSOLUTE MAXIMUM RATINGS* 

Read Operating Temperature ............ 0°C to +70°C(8)
Case Temperature under Bias ........... -10°C to +80°C(8)
Storage Temperature ................... -65°C to +125°C
All Input or Output Voltages
  with Respect to Ground .............. -0.6V to +6.5V(4)
Voltage on A9
  with Respect to Ground .............. -0.6V to +13.0V(4)
VPP Supply Voltage
  with Respect to Ground .............. -0.6V to +14.0V(4)
VCC Supply Voltage
  with Respect to Ground .............. -0.6V to +7.0V(4)



NOTICE: This data sheet contains preliminary infor- 
mation on new products in production. The specifica- 
tions are subject to change without notice. Verify with 
your local Intel Sales office that you have the latest 
data sheet before finalizing a design. 



* WARNING: Stressing the device beyond the "Absolute 
Maximum Ratings" may cause permanent damage. 
These are stress ratings only. Operation beyond the 
"Operating Conditions" is not recommended and ex- 
tended exposure beyond the "Operating Conditions" 
may affect device reliability. 



DC CHARACTERISTICS: READ OPERATION 

0°C < TA < +70°C, VCC = 5V ±10%, TTL Inputs

Symbol  Parameter                 Notes   Min        Max      Unit  Test Condition
ILI     Input Load Current                           1        µA    VIN = 5.5V
ILO     Output Leakage Current                       10       µA    VOUT = 5.5V
IPP     VPP Load Current Read                        10       µA    VPP = 0 to VCC, PGM = VIH
ISB     VCC Standby   Switching   2                  45       mA    CS = VIH, f = 25 MHz
                      Stable      2                  30       mA    CS = VIH
ICC     VCC Active Current        1,3,7              125      mA    CS = VIL, f = 25 MHz, IOUT = 0 mA
VIL     Input Low Voltage         4       -0.5       0.8      V
VIH     Input High Voltage                2.0        VCC + 1  V
VOL     Output Low Voltage                           0.45     V     IOL = 2.1 mA
VOH     Output High Voltage       5       VCC - 0.8           V     IOH = -100 µA
                                  5       2.4                 V     IOH = -400 µA
IOS     Output Short Circuit      6                  100      mA



NOTES: 

1. Maximum current is with outputs unloaded.

2. ICC standby current assumes no output loading, i.e., IOH = IOL = 0 mA.

3. ICC is the sum of current through VCC3 + VCC4 and does not include the current through VCC1 and VCC2. (VCC1 and
VCC2 supply power to the output drivers. VCC3 and VCC4 supply power to the rest of the device.)

4. Minimum DC voltage on input and output pins is -0.5V. During transitions, this level may undershoot to -2.0V for
periods less than 20 ns.

5. Maximum DC voltage on input and output pins is VCC + 0.5V, which may overshoot to VCC + 2.0V for periods less than
20 ns.

6. One output shorted for no more than one second. IOS is sampled but not 100% tested.

7. ICC max measured with a 10.11 µF capacitor between VCC and VSS.

8. This specification defines commercial product operating temperatures.






EXPLANATION OF AC SYMBOLS

The nomenclature used for timing parameters is as
per IEEE STD 662-1980, IEEE Standard Terminology
for Semiconductor Memory.

Each timing symbol has five characters. The first is
always a "t" (for time). The second character represents
a signal name (e.g., CLK, ALE). The third
character represents the signal's level (high or low)
for the signal indicated by the second character. The
fourth character represents a signal name at which a
transition occurs marking the end of the time interval
being specified. The fifth character represents the
signal level indicated for the fourth character. The
list below shows character representations.



A:  Address                       R:  Reset
B:  BLAST                         Q:  Data
C:  Clock                         S:  CS
H:  Logic High Level              t:  Time
L:  ALE/Logic Low Level           V:  Valid
P:  VPP Programming Voltage       Z:  Tri-state level
X:  No longer a valid "driven" logic level
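
As an illustration of the five-character convention, the Python sketch below expands a symbol such as tAVCH into words. The dictionaries restate the character list above. The function name is hypothetical, "X" is rendered as "Invalid" and "Z" as "High-Z" following the AC table wording, and symbols that carry extra characters (for example tCNHSX) are deliberately rejected rather than guessed at.

```python
# Characters that name a signal (2nd and 4th positions)
SIGNALS = {
    "A": "Address", "B": "BLAST", "C": "Clock", "L": "ALE",
    "P": "VPP", "Q": "Data", "R": "Reset", "S": "CS",
}
# Characters that name a level (3rd and 5th positions)
LEVELS = {
    "H": "High", "L": "Low", "V": "Valid", "X": "Invalid", "Z": "High-Z",
}


def decode_timing_symbol(symbol):
    """Expand a five-character timing symbol, e.g. 'tAVCH'."""
    if len(symbol) != 5 or symbol[0] != "t":
        raise ValueError("expected 't' followed by four characters")
    _, sig1, lvl1, sig2, lvl2 = symbol
    return f"{SIGNALS[sig1]} {LEVELS[lvl1]} to {SIGNALS[sig2]} {LEVELS[lvl2]}"


print(decode_timing_symbol("tAVCH"))
print(decode_timing_symbol("tLLLH"))
```

Note that "L" means ALE in a signal position and logic low in a level position, which is why two separate lookup tables are used.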



AC CHARACTERISTICS: READ OPERATION 


0°C < TA < +70°C, VCC = 5V ±10%

Versions                                         27960K2-25      27960K1-20      27960K2-20      27960K1-16
                                                 25 MHz          20 MHz          20 MHz          16 MHz
                                                 2 Wait States   1 Wait State    2 Wait States   1 Wait State
No  Symbol  Characteristic               Notes   Min     Max     Min     Max     Min     Max     Min     Max     Unit
1   tAVCH   Address Valid to CLK High            12              18              10              15              ns
2   tAVLH   Address Valid to ALE High            10              10              10              10              ns
3   tLLLH   ALE Low to ALE High                  12              12              12              12              ns
4   tLHAX   ALE High to Address Invalid          8               8               8               8               ns
5   tSVCH   CS Valid to CLK High         1,5     5               8               7               8               ns
6   tCNHSX  CLK High to CS Invalid       2                                                                       ns
7   tCHQV   CLK High to Data Valid       7               33              43              38              45      ns
8   tCHQX   CLK High to Data Invalid             7               7               7               7               ns
9   tCHQZ   CLK High to Data High-Z      6               30              35              35              35      ns
10  tBVCH   BLAST Valid to CLK High              15              15              15              15              ns
11  tCHBX   CLK High to BLAST Invalid    3       5       35      5       45      5       45      5       45      ns




NOTES: 

1. Valid signal level is meant to be either a logic high or logic low.

2. tCNHSX: The subscript N represents the number of wait states for this parameter. CS can be de-asserted (high) after the
number of wait states (N) has expired. The EPROM will continue to burst out data for the current cycle.

3. BLAST must be returned high before the next rising clock edge.

4. The sum of tCHQV + tAVCH + N CLK will not equal actual tAVQV if independent test conditions are used to obtain tAVCH
and tCHQV (N = number of wait states).

5. CS must be deasserted after every burst read (see Figure 7).

6. Sampled, not 100% tested. The transition is measured ±500 mV from steady state voltage.

7. For capacitive loads above 120 pF, tCHQV can be derated by 1 ns/20 pF.
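
Note 7 can be turned into a small calculation. The Python sketch below assumes the derating is linear and additive above the 120 pF test load, which is one reading of "1 ns/20 pF". The function and parameter names are illustrative.

```python
def derated_tchqv_ns(tchqv_ns, load_pf, rated_load_pf=120.0, ns_per_20pf=1.0):
    """Estimate tCHQV for a load heavier than the 120 pF test load.

    Assumption: delay grows linearly by ns_per_20pf for every 20 pF
    above rated_load_pf. Loads at or below the rated value are not credited.
    """
    extra_pf = max(0.0, load_pf - rated_load_pf)
    return tchqv_ns + ns_per_20pf * extra_pf / 20.0


# 27960K2-20 (tCHQV max = 38 ns) driving a 160 pF load
print(derated_tchqv_ns(38, 160))
```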
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AC CONDITIONS OF TEST 

Input Rise and Fall Times (10% to 90%) ..... 4 ns
Input Pulse Levels ......................... 0.45V to 2.4V
Input Timing Reference Level ............... 1.5V
Output Timing Reference Level .............. 0.8V and 2.0V









Table 2. Mode Table

MODE                       CS    PGM   BLAST    ALE      RESET  A9       VPP      VCC   OUTPUT
Read                       VIL   VIH   VIH(1)   VIH(2)   VIH    X(4)     VCC      VCC   DOUT
Standby(6)                 VIH   X     X        X        VIH    X        VCC(5)   VCC   High Z
Program                    VIL   VIL   VIH      VIH(2)   VIH    X        (3)      (3)   DIN
Program Verify             VIL   VIH   VIH(1)   VIH      VIH    X        (3)      (3)   DOUT
Program Inhibit            VIH   X     X        X        VIH    X        (3)      (3)   High Z
ID Byte 0: Manufacturer    VIL   VIH   VIH(1)   VIH(2)   VIH    VID(3)   VCC      VCC   89H
ID Byte 1: Part (27960)    VIL   VIH   VIH(1)   VIH(2)   VIH    VID(3)   VCC      VCC   E0H
ID Byte 2: KX              VIL   VIH   VIH(1)   VIH(2)   VIH    VID(3)   VCC      VCC   00B
ID Byte 3: 1 Wait-State    VIL   VIH   VIH(1)   VIH(2)   VIH    VID(3)   VCC      VCC   01B
           2 Wait-States                                                                10B
Reset                      X     X     X        X        VIL    X        VCC      VCC   High Z




NOTES: 

1. VIH until data terminated, at which time BLAST must go to VIL.

2. Need to toggle from VIH to VIL to VIH to latch address.

3. See DC Programming Characteristics for VCC, VID and VPP voltages.

4. X can be VIL or VIH.

5. VPP = VCC to meet standby current specification. VCC > VPP > VIL will cause a slight increase in standby current.

6. The device must be in the idle state (by asserting RESET or using BLAST) before going into standby.
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CAPACITANCE(I) T A = 25°c, f = 1.0 MHz 


Symbol  Parameter           Typ  Max  Unit  Condition
CIN     Input Capacitance   4    6    pF    VIN = 0V
COUT    Output Capacitance  12   15   pF    VOUT = 0V
CVPP    VPP Capacitance     40   45   pF    VIN = 0V



AC INPUT/OUTPUT REFERENCE WAVEFORMS 



[Waveform: input driven between VOH and VOL, with the timing parameter measurement points marked. 290237-14]

AC test inputs are driven at 2.4V (VOH) for a logic '1'
and 0.45V (VOL) for a logic '0'.
Input timing begins at 1.5V.
Output timing ends at VIH (2.0V) and VIL (0.8V).
Input rise and fall times (10% to 90%) < 4.0 ns



AC TESTING LOAD CIRCUIT 



[Load circuit: device under test, 2.1V, CL = 120 pF. 290237-15]

For tCHQZ, CL = 5 pF and RL = 405Ω.
CL includes jig capacitance.



CLOCK CHARACTERISTICS 



Versions             25 MHz       20 MHz       16 MHz
Symbol  Parameter    Min   Max    Min   Max    Min   Max    Units
CLK     Period       40           50           62.5         ns
T5      Rise Time          1.0          10           10     ns
T4      Fall Time          10           10           10     ns
T2      Low Time     7            8            11           ns
T3      High Time    7            8            11           ns

Max CLK Rise Time during Programming is 100 ns



CLOCK WAVEFORM 
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Program/Program Verify 

Initially, and after each erasure, all bits of the
EPROM are in the "1's" state. Data is introduced by
selectively programming "0's" into the desired bit
locations. Although only "0's" can be programmed,
both "1's" and "0's" can be present in the data
word. Ultraviolet erasure is the only way to change
"0's" to "1's".

Program mode is entered when VPP is raised to
12.75V. Program/Verify operation is synchronous
with the clock and can only be initiated following an
idle state. Program and Program Verify take place in
3 clock cycles. In the first clock cycle, addresses
and data are input and programming occurs. Program
Verify follows in the second clock cycle, and
the third clock cycle terminates synchronous
Program/Verify operation, returning the state machine
to the idle state with outputs at high impedance.

As in the Read mode, A2-A16 point to a four byte
block in the memory array. During Programming the
internal address increment circuitry is disabled and
the programmer must supply A0 and A1 to point to
an individual byte within the four byte block that is to
be programmed. Only one byte is programmed in
each 3 cycle Program/Verify sequence.
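
Two points from this description can be modeled in a few lines: programming can only clear bits, and a byte address splits into a block pointer (A2-A16) and a byte select (A1, A0). The Python sketch below is a behavioral model under those assumptions only. The names are illustrative and nothing here represents the device's internal circuitry.

```python
ERASED_BYTE = 0xFF  # every bit reads as 1 after ultraviolet erasure


def program_byte(cell, data):
    """Model one programming step: bits can go from 1 to 0, never 0 to 1."""
    return cell & data & 0xFF


def split_address(addr):
    """Split a 17-bit byte address into (block index from A2-A16, byte from A1-A0)."""
    if not 0 <= addr < (1 << 17):
        raise ValueError("address must fit in A0-A16")
    return addr >> 2, addr & 0x3


cell = program_byte(ERASED_BYTE, 0xA5)   # erased cell takes the data as written
cell_again = program_byte(cell, 0x5A)    # a second write can only clear more bits
print(hex(cell), hex(cell_again), split_address(0x1FFFF))
```

The second write shows why a programmed device must be erased before different data can be stored at the same location.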



Program Inhibit 

Program Inhibit mode allows parallel programming
and verification of multiple devices with different
data. With VPP at 12.75V, a Program/Verify sequence
is initiated for any device that receives a valid
ALE pulse and rising clock edge while CS is asserted.
A PGM pulse programs data in the first cycle
of the sequence, and data for Program Verify is output
in the second cycle. The Program/Verify sequence
is inhibited on any devices for which CS is
not asserted during the first (ALE) cycle. Data will
not be programmed and the outputs will remain in
their high impedance state.



inteligent Identifier™ Mode

The device's manufacturer, product type, and configuration
are stored in a four byte block that can be
accessed by using the inteligent Identifier mode.
The programmer can verify the device identifier and
choose the programming algorithm that corresponds
to the Intel 27960KX. The inteligent Identifier can
also be used to verify that the product is configured
with the desired Read mode options for wait states.

inteligent Identifier mode is entered when A9 (pin
32) is raised to its high voltage (VH) level. The internal
state machine is then set for inteligent Identifier
Read operation. Reading the Identifier is similar to a
Read operation on a one wait state configured product.
Up to four bytes can be read in a single burst
access. The inteligent Identifier read is terminated by a
synchronous BLAST input, returning the state machine
to the idle state with outputs at high impedance.

The four byte block for the inteligent Identifier
code is located at address 00H through 03H and is
encoded as follows:



MEANING         (A1, A0)   DATA
Intel ID        Byte 00    89h
27960           Byte 01    E0h
KX              Byte 10    00b
1 wait state    Byte 11    01b
2 wait states   Byte 11    10b

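
A programmer could check the four identifier bytes as sketched below in Python. This assumes each code in the table occupies the whole byte read back (the source does not say whether upper bits are significant for the two-bit codes), and the function and key names are hypothetical.

```python
# Byte 3 codes from the identifier table
WAIT_STATE_CODES = {0b01: 1, 0b10: 2}


def decode_identifier(id_bytes):
    """Interpret the four-byte identifier block read at addresses 00H-03H."""
    if len(id_bytes) != 4:
        raise ValueError("identifier block is four bytes")
    manufacturer, part, variant, wait_code = id_bytes
    return {
        "is_intel": manufacturer == 0x89,
        "is_27960": part == 0xE0,
        "is_kx": variant == 0b00,
        # None means the code is not one of the two listed in the table
        "wait_states": WAIT_STATE_CODES.get(wait_code),
    }


print(decode_identifier(bytes([0x89, 0xE0, 0x00, 0x02])))
```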


RESET MODE 








Due to the synchronous nature of the 27960KX, the
various operating modes must be initiated from a
known idle state. During normal operation, the internal
state machine returns to an idle state at the
termination of a bus access (after BLAST is asserted).

During initial device power up, the state machine is
in an indeterminate state. The reset mode is provided
to force operation into the idle state. Reset mode
is entered when the RESET pin is asserted. Output
pins are asynchronously set to the high impedance
state and address latches are put into the
flow-through mode. A reset is successfully completed,
and the state machine set in an idle state, in the
cycle after RESET has been asserted for a minimum
of 10 clock cycles and deasserted for five clock cycles.
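
The cycle counts translate directly into a minimum reset time at a given clock frequency. The Python sketch below does that conversion. It counts only the 10 asserted and 5 deasserted cycles named in the text and treats the result as a lower bound, and the function name is illustrative.

```python
RESET_ASSERTED_CYCLES = 10    # minimum cycles with RESET asserted
RESET_DEASSERTED_CYCLES = 5   # cycles with RESET deasserted before idle


def min_reset_sequence_ns(clock_mhz):
    """Lower bound on the reset sequence duration at the given CLK frequency."""
    cycles = RESET_ASSERTED_CYCLES + RESET_DEASSERTED_CYCLES
    return cycles * 1000.0 / clock_mhz


for mhz in (25, 20, 16):
    print(mhz, "MHz:", min_reset_sequence_ns(mhz), "ns")
```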
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Figure 8. Quick-Pulse Programming**/! Algorithm 
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QUICK-PULSE PROGRAMMING 
ALGORITHM 

The Quick-Pulse Programming algorithm programs
Intel's 27960KX. Developed to substantially reduce
programming throughput time, this algorithm allows
optimized equipment to program a 27960KX in under
17 seconds. Actual programming time depends
on the programmer used.

The Quick-Pulse Programming algorithm uses a
100 µs pulse followed by a byte verification to determine
when the addressed byte is correctly programmed.
The algorithm terminates if 25 100 µs
pulses fail to program a byte. Figure 8 shows the
27960KX Quick-Pulse Programming algorithm flowchart.

The entire program-pulse, byte-verify sequence is
performed with VCC = 6.25V and VPP = 12.75V.
The programming equipment must establish VCC before
applying voltages to any other pins. When programming
is complete, all bytes should be compared
to the original data with VCC = 5.0V and VPP =
12.75V.
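
The pulse-and-verify loop described above can be sketched in Python against a simulated device. This is only an outline built from the prose: the Figure 8 flowchart is not reproduced here, FakeEprom is a hypothetical stand-in for real hardware, and supply voltage sequencing is not modeled.

```python
MAX_PULSES = 25  # the algorithm gives up after 25 pulses of 100 µs on one byte


class FakeEprom:
    """Hypothetical stand-in: each byte 'takes' after a fixed number of pulses."""

    def __init__(self, size, pulses_needed=1):
        self.cells = [0xFF] * size        # erased state
        self.pulses_needed = pulses_needed
        self._pulse_count = [0] * size

    def pulse(self, addr, data):
        self._pulse_count[addr] += 1
        if self._pulse_count[addr] >= self.pulses_needed:
            self.cells[addr] &= data      # programming can only clear bits

    def read(self, addr):
        return self.cells[addr]


def quick_pulse_program(dev, image):
    """Program image byte by byte; return True only if every byte verifies."""
    for addr, data in enumerate(image):
        for _ in range(MAX_PULSES):
            dev.pulse(addr, data)         # one 100 µs program pulse
            if dev.read(addr) == data:    # byte verification
                break
        else:
            return False                  # 25 pulses failed: device fails
    # final comparison of all bytes against the original data
    return all(dev.read(a) == d for a, d in enumerate(image))


print(quick_pulse_program(FakeEprom(4, pulses_needed=3), [0x12, 0x34, 0xFF, 0x00]))
```

Because most bytes verify after a single pulse on a good device, total time is dominated by one 100 µs pulse per byte plus programmer overhead.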



D.C. PROGRAMMING CHARACTERISTICS TA = 25°C ±5°C

Symbol  Parameter                         Notes  Min        Max        Unit  Test Condition
ILI     Input Load Current                                  10         µA    VIN = VIH or VIL
ICC     VCC Program Current               1                 125        mA    CS = VIL
IPP     VPP Program Current               1                 50         mA    CS = VIL
VIL     Input Low Voltage                        -0.5       0.8        V
VIH     Input High Voltage                       2.0        VCC + 0.5  V
VOL     Output Low Voltage (Verify)                         0.40       V     IOL = 2.1 mA
VOH     Output High Voltage (Verify)             VCC - 0.8             V     IOH = -400 µA
VID     A9 inteligent Identifier Voltage         11.5       12.5       V
VCC     Supply Voltage (Program)          2      6.0        6.5        V
VPP     Program Voltage                   2      12.5       13.0       V




NOTES: 

1. The maximum current value is with outputs unloaded.

2. VCC must be applied simultaneously or before VPP and removed simultaneously or after VPP.

3. During programming, clock levels are VIH and VIL.
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AC PROGRAMMING, RESET AND ID CHARACTERISTICS T A = 25°C ± 5°C 


No  Symbol  Parameter                        Notes  Min  Max  Units
1   tAVPL   Address Valid to PGM Low                2         µs
2   tCHAX   CLK High to Address Invalid             50        ns
3   tLLCH   ALE Low to CLK High              1      50        ns
4   tCHLH   CLK High to ALE High             2      50        ns
5   tSVCH   CS Valid to CLK High                    50        ns
6   tCHSX   CLK High to CS Invalid           3                ns
7   tCHQV   CLK High to DOUT Valid                       100  ns
8   tCHQX   CLK High to DOUT Invalid                          ns
9   tBVCH   BLAST Valid to CLK High                 50        ns
10  tCHBX   CLK High to BLAST Invalid        4      50        ns
11  tQVPL   DATA Valid to PGM Low                   2         µs
12  tPLPH   PGM Program Pulse Width                 95   105  µs
13  tPHQX   PGM High to DIN Invalid                 2         µs
14  tCLPL   CLK Low to PGM Low                      50        ns
15  tQZCH   DIN in Tri-State to CLK High            2         µs
16  tVCS    VCC Program Voltage to CLK High  7      2         µs
17  tVPS    VPP Program Voltage to CLK High  7      2         µs
18  tA9HCH  A9 VID Voltage to CLK High              2         µs
19  tCHA9X  CLK High to A9 not VID Voltage          2         µs
20  tRVCH   RESET Valid to CLK High          6      50        ns
21  tCHCL   CLK High to CLK Low              5      100       ns
22  tCLCH   CLK Low to CLK High              5      100       ns




NOTES: 

1. If CS is low, ALE can go low no sooner than the falling edge of the previous CLK.

2. ALE must return high prior to the next rising edge of clock.

3. CS must remain low until after the rising edge of CLK1.

4. BLAST must return high prior to the next rising edge of CLK.

5. Max CLK rise/fall time is 100 ns.

6. RESET must be held low for 10 cycles and high for 5 cycles before performing a read.

7. VCC must be applied simultaneously or before VPP and removed simultaneously or after VPP.
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82596CA 

HIGH-PERFORMANCE 32-BIT LOCAL 

AREA NETWORK COPROCESSOR 



■ Performs Complete CSMA/CD Medium
  Access Control (MAC) Functions
  Independently of CPU
  - IEEE 802.3 (EOC) Frame Delimiting
  - HDLC Frame Delimiting

■ Supports Industry Standard LANs
  - IEEE TYPE 10BASE-T,
    IEEE TYPE 10BASE5 (Ethernet*),
    IEEE TYPE 10BASE2 (Cheapernet),
    IEEE TYPE 1BASE5 (StarLAN),
    and the Proposed Standard
    10BASE-F
  - Proprietary CSMA/CD Networks Up
    to 20 Mb/s

■ On-Chip Memory Management
  - Automatic Buffer Chaining
  - Buffer Reclamation after Receipt of
    Bad Frames; Optional Save Bad
    Frames
  - 32-Bit Segmented or Linear (Flat)
    Memory Addressing Formats

■ Network Management and Diagnostics
  - Monitor Mode
  - 32-Bit Statistical Counters

■ 82586 Software Compatible

■ Optimized CPU Interface
  - Optimized Bus Interface to Intel's
    i486™ DX, i486™ SX and 80960CA
    Processors
  - Supports Big Endian and Little
    Endian Byte Ordering

■ 32-Bit Bus Master Interface
  - 106 MB/s Bus Bandwidth
  - Burst Bus Transfers
  - Bus Throttle Timers
  - Transfers Data at 100% of Serial
    Bandwidth
  - 128-Byte Receive FIFO, 64-Byte
    Transmit FIFO

■ Self-Test Diagnostics

■ Configurable Initialization Root for Data
  Structures

■ High-Speed, 5V, CHMOS** IV
  Technology

■ 132-Pin Plastic Quad Flat Pack (PQFP)
  and PGA Package

(See Packaging Spec Order No. 240800-001, 
Package Type KU and A) 

i486 is a trademark of Intel Corporation. 

* Ethernet is a registered trademark of Xerox Corporation. 
**CHMOS is a patented process of Intel Corporation. 
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INTRODUCTION 

The 82596CA is an intelligent, high-performance
32-bit Local Area Network coprocessor. The
82596CA implements the CSMA/CD access method
and can be configured to support all existing IEEE
802.3 standards: TYPEs 10BASE-T, 10BASE5,
10BASE2, 1BASE5, and 10BROAD36. It can also be
used to implement the proposed standard TYPE
10BASE-F. The 82596CA performs high-level commands,
command chaining, and interprocessor communications
via shared memory, thus relieving the
host CPU of many tasks associated with network
control. All time-critical functions are performed
independently of the CPU; this increases network
performance and efficiency. The 82596CA bus interface
is optimized for Intel's i486™ SX, i486™ DX,
80960CA, and 80960KB processors.

The 82596CA implements all IEEE 802.3 Medium
Access Control and channel interface functions.
These include framing, preamble generation and
stripping, source address generation, destination
address checking, short-frame detection, and automatic
length-field handling. Data rates up to 20 Mb/s are
supported.

The 82596CA provides a powerful host system interface.
It manages memory structures automatically,
with command chaining and bidirectional data chaining.
An on-chip DMA controller manages four channels;
this allows autonomous transfer of data blocks
(buffers and frames) and relieves the CPU of byte
transfer overhead. Buffers containing errored or
collided frames can be automatically recovered without
CPU intervention. The 82596CA provides an upgrade
path for existing 82586 software drivers by
providing an 82586-software-compatible mode that
supports the current 82586 memory structure. The
82596CA also has a Flexible memory structure and
a Simplified memory structure. The 82596CA can
address up to 4 gigabytes of memory. The 82596CA
supports Little Endian and Big Endian byte ordering.

The 82596CA bus interface can achieve a burst
transfer rate of 106 MB/s at 33 MHz. The bus interface
employs bus throttle timers to regulate
82596CA bus use. Two large, independent FIFOs
(128 bytes for Receive and 64 bytes for Transmit)
tolerate long bus latencies and provide programmable
thresholds that allow the user to optimize bus
overhead for any worst-case bus latency. The
high-performance bus is capable of back-to-back
transmission and reception during the IEEE 802.3 9.6-µs
Interframe Spacing (IFS) period.

The 82596CA provides a wide range of diagnostics
and network management functions. These include
internal and external loopback, exception condition
tallies, channel activity indicators, optional capture
of all frames regardless of destination address
(promiscuous mode), optional capture of errored or
collided frames, and time domain reflectometry for
locating fault points on the network cable. The
statistical counters, in 32-bit segmented and linear
modes, are 32 bits each and include CRC errors,
alignment errors, overrun errors, resource errors,
short frames, and received collisions. The 82596CA
also features a monitor mode for network analysis.
In this mode the 82596CA can capture status bytes,
and update statistical counters, of frames monitored
on the link without transferring the contents of the
frames to memory. This can be done concurrently
while transmitting and receiving frames destined for
that station.

The 82596CA can be used in both baseband and
broadband networks. It can be configured for maximum
network efficiency (minimum contention overhead)
with networks of any length. Its highly flexible
CSMA/CD unit supports address field lengths of
zero through six bytes, configurable to either IEEE
802.3/Ethernet or HDLC frame delimitation. It also
supports 16- or 32-bit cyclic redundancy checks.
The CRC can be transferred directly to memory for
receive operations, or dynamically inserted for transmit
operations. The CSMA/CD unit can also be configured
for full duplex operation for high throughput
in point-to-point connections.



82596 B-Stepping 

The 82596 B-step incorporates new features compared
to the 82596 A1 stepping. The following is a
summary of the 82596 B-step new features.

• The 82596 B-step transmit buffers can now be
byte aligned.

• In big endian mode, and when configured to Linear
mode, the 82596 B-step treats 32-bit address
pointers as big endian 32-bit entities. However,
the SCB absolute address and statistical counters
are still treated as two 16-bit big endian entities.
This big endian 32-bit entity support is configured
through the SYSBUS byte; not setting this
mode will configure the 82596 B-step to be 100%
compatible with the 82596 A1-step big endian
mode.

• The 82596 B-step has improved performance on
back-to-back frame transmission.

• The 82596 B-step can be configured to reread
the next Command Block on the CB list upon
receiving a CU RESUME Control Command.

The 82596CA is fabricated with Intel's reliable, 5-V,
CHMOS IV (process 648.8) technology. It is available
in a 132-pin PQFP or PGA package.
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82596CA PGA Cross Reference by Pin Name 
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PIN DESCRIPTIONS 



Symbol 



PQFP 
Pin No. 



Type 



Name and Function 



CLK 



CLOCK. The system clock input provides the fundamental timing for 
the 82596. It is a 1X CLK input used to generate the 82596 clock and 
requires TTL levels. All external timing parameters are specified in 
reference to the rising edge of CLK. 



D0-D31 



14-53 



I/O 



DATA BUS. The 32 Data Bus lines are bidirectional, tri-state lines that
provide the general purpose data path between the 82596 and
memory. With the 82596 the bus can be either 16 or 32 bits wide; this
is determined by the BS16 signal. The 82596 always drives all 32 data
lines during Write operations, even with a 16-bit bus. D31-D0 are
floated after a Reset or when the bus is not acquired.
These lines are inputs during a CPU Port access; in this mode the CPU
writes the next address to the 82596 through the data lines. During
PORT commands (Relocatable SCP, Self-Test, Reset and Dump) the
address must be aligned to a 16-byte boundary. This frees the D3-D0
lines so they can be used to distinguish the commands. The following
is a summary of the decoding data.

Function           D0 D1 D2 D3    D31-D4
Reset              0000
Relocatable SCP                   ADDR
Self-Test                         ADDR
Dump Command                      ADDR



DP0-DP3 
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I/O 



DATA PARITY. These are tri-stated data parity pins. There is one 
parity line for each byte of the data bus. The 82596 drives them with 
even-parity information during write operations having the same timing 
as data writes. Likewise, even-parity information, with the same timing 
as read information, must be driven back to the 82596 over these pins 
to ensure that the correct parity check status is indicated by the 
82596. 
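
A short Python sketch of per-byte even parity follows. It assumes the usual convention that each parity bit makes the count of 1s in its byte plus the parity bit even, and it assumes DP0 covers D7-D0 through DP3 covering D31-D24. Neither detail is spelled out in the pin description, and the function name is illustrative.

```python
def even_parity_bits(word):
    """Return [DP0, DP1, DP2, DP3] for a 32-bit data word.

    Assumption: DPn accompanies byte lane n (DP0 with D7-D0), and the bit
    is 1 when the byte holds an odd number of 1s, so the total is even.
    """
    bits = []
    for lane in range(4):
        byte = (word >> (8 * lane)) & 0xFF
        bits.append(bin(byte).count("1") & 1)
    return bits


print(even_parity_bits(0x0103FF01))
```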



PCHK 



127 



PARITY CHECK. This pin is driven high one clock after RDY to inform 
Read operations of the parity status of data sampled at the end of the 
previous clock cycle. When driven low it indicates that incorrect parity 
data has been sampled. It only checks the parity status of enabled 
bytes, which are indicated by the Byte Enable and Bus Size signals. 
PCHK is only valid for one clock time after data read is returned to the 
82596; i.e., it is inactive (high) at all other times. 



A31-A2 



70-108 



ADDRESS LINES. These 30 tri-stated Address lines output the 
address bits required for memory operation. These lines are floated 
after a Reset or when the bus is not acquired. 



BE3-BE0 



109-114 



BYTE ENABLE. These tri-stated signals are used to indicate which
bytes are involved with the current memory access. The number of
Byte Enable signals asserted indicates the physical size of the data
being transferred (1, 2, 3, or 4 bytes).
• BE0 indicates D7-D0
• BE1 indicates D15-D8
• BE2 indicates D23-D16
• BE3 indicates D31-D24
These lines are floated after a Reset or when the bus is not acquired.
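
The byte enable mapping can be expressed as a lookup, sketched below in Python. The functions take the set of asserted enable numbers as input and do not model the electrical polarity of the pins. All names are illustrative.

```python
# Byte enable number -> (high data line, low data line)
BYTE_LANES = {0: (7, 0), 1: (15, 8), 2: (23, 16), 3: (31, 24)}


def active_lanes(asserted):
    """List the data lines involved for the asserted byte enables."""
    return [f"D{hi}-D{lo}" for n, (hi, lo) in sorted(BYTE_LANES.items())
            if n in asserted]


def transfer_size(asserted):
    """Number of bytes transferred equals the number of enables asserted."""
    return len(set(asserted) & set(BYTE_LANES))


print(active_lanes({0, 1}), transfer_size({0, 1}))
```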



W/R 



120 



WRITE/READ. This dual function pin is used to distinguish Write and 
Read cycles. This line is floated after a Reset or when the bus is not 
acquired. 
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PIN DESCRIPTIONS (Continued) 



Symbol 


PQFP 
Pin No. 


Type 


Name and Function 


ADS 


124 





ADDRESS STATUS. The 82596 uses this tri-state pin to
indicate that a valid bus cycle has begun and that A31-A2, BE3-BE0,
and W/R are being driven. It is asserted during t1 bus states. This line
is floated after a Reset or when the bus is not acquired.


RDY 


130 


I 


READY. Active low. This signal is the acknowledgment from 
addressed memory that the transfer cycle can be completed. When 
high, it causes wait states to be inserted. It is ignored at the end of the 
first clock of the bus cycle's data cycle. This active-low signal does not 
have an internal pull-up resistor. This signal must meet the setup and 
hold times to operate correctly. 


BRDY 


2 


I 


BURST READY. Active low. Burst Ready, like RDY, indicates that the
external system has presented valid data on the data pins in response
to a Read, or that the external system has accepted the 82596 data in
response to a Write request. Also, like RDY, this signal is ignored at
the end of the first clock in a bus cycle. If the 82596 can still receive
data from the previous cycle, ADS will not be asserted in the next
clock cycle; however, Address and Byte Enable will change to reflect
the next data item expected by the 82596. BRDY will be sampled
during each succeeding clock and, if active, the data on the pins will be
strobed to the 82596 or to external memory (read/write). BRDY
operates exactly like READY during the last data cycle of a burst
sequence and during nonburstable cycles.


BLAST 


128 





BURST LAST. A signal (active low) on this tri-state pin indicates that 
the burst cycle is finished and when BRDY is next returned it will be 
treated as a normal ready; i.e., another set of addresses will be driven 
with ADS or the bus will go idle. BLAST is not asserted if the bus is not 
acquired. 


AHOLD 


117


I 


ADDRESS HOLD. This hold signal is active high; it allows another bus
master to access the 82596 address bus. In a system where an 82596 
and an i486 processor share the local bus, AHOLD allows the cache 
controller to make a cache invalidation cycle while the 82596 holds the 
address lines. In response to a signal on this pin, the 82596 
immediately (i.e. during the next clock) stops driving the entire address 
bus (A31 -A2); the rest of the bus can remain active. For example, 
data can be returned for a previously specified bus cycle during 
Address Hold. The 82596 will not begin another bus cycle while 
AHOLD is active. 


BOFF 


116 


I 


BACKOFF. This signal is active low; it informs the 82596 that another
bus master requires access to the bus before the 82596 bus cycle
completes. The 82596 immediately (i.e. during the next clock) floats its
bus. Any data returned to the 82596 while BOFF is asserted is ignored.
BOFF has higher priority than RDY or BRDY; if two such signals are
returned in the same clock period, BOFF is given preference. The
82596 remains in Hold until BOFF goes high, then the 82596 resumes
its bus cycle by driving out the address and status, and asserting ADS.
BOFF should not be asserted during T1.


LOCK 


126 





LOCK. This tri-state pin is used to distinguish locked and unlocked bus
cycles. LOCK generates a semaphore handshake to the CPU. LOCK
can be active for several memory cycles; it goes active during the first
locked memory cycle (t1) and goes inactive at the last locked cycle
(t2). This line is floated after a Reset or when the bus is not acquired.
LOCK can be disabled via the sysbus byte in software.
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PIN DESCRIPTIONS (Continued) 



Symbol 


PQFP 
Pin No. 


Type 


Name and Function 


BS16 


129 


I 


BUS SIZE. This signal allows the 82596CA to work with either 16- or
32-bit buses. Asserting BS16 low causes the 82596 to perform two
16-bit memory accesses when transferring 32-bit data. In Little Endian
mode the D15-D0 lines are driven when BS16 is asserted; in Big
Endian mode the D31-D16 lines are driven.


HOLD 


123 





HOLD. The HOLD signal is active high; the 82596 uses it to request
local bus mastership. In normal operation HOLD goes inactive before
HLDA. The 82596 can be forced off the bus by deasserting HLDA or if
the bus throttle timers expire.


HLDA 


118 


I 


HOLD ACKNOWLEDGE. This active-high signal indicates 
that bus mastership has been given to the 82596. HLDA is internally 
synchronized; after HOLD is detected low, the CPU drives HLDA low. 

NOTE: 
Do not connect HLDA to Vcc; it will cause a deadlock. A user wanting 
to give the 82596 permanent access to the bus should connect HLDA 
to HOLD. If HLDA goes inactive before HOLD, the 82596 will release 
the bus (by deasserting HOLD) within a maximum number of bus 
cycles, as specified in the 82596 User's Manual. 


BREQ 


115 


I 


BUS REQUEST. This signal, when configured to an externally 
activated mode, is used to trigger the bus throttle timers. 




PORT 


3 


I 


PORT. When this signal is received, the 82596 latches the data on the 
data bus into an internal 32-bit register. While the CPU asserts this 
signal, it can write into the 82596 (via the data bus). This pin must be 
activated twice during all CPU Port access commands. 


RESET 


69 


I 


RESET. This active high, internally synchronized signal causes the 
82596 to terminate current activity. The signal must be high for at least 
five system clock cycles. After five system clock cycles and four TxC 
clock cycles the 82596 will execute a Reset when it receives a high 
RESET signal. When RESET returns to low the 82596 waits for the 
first CA signal and then begins the initialization sequence. 


LE/BE 


65 


I 


LITTLE ENDIAN/BIG ENDIAN. This dual-function pin is used to 
select byte ordering. When LE/BE is high, little endian byte ordering is 
used; when low, big endian byte ordering is used for data in frames 
(bytes) and for control (SCB, RFD, CBL, etc). 


CA 


119 


I 


CHANNEL ATTENTION. The CPU uses this pin to force the 82596 to 
begin executing memory resident Command blocks. The CA signal is 
internally synchronized. The signal must be high for at least one 
system clock. It is latched internally on the high to low edge and then 
detected by the 82596. 

The first CA after a Reset forces the 82596 into the initialization 
sequence beginning at location 00FFFFF6h or an SCP address written 
to the 82596 using CPU Port access. All subsequent CA signals cause 
the 82596 to begin executing new command sequences from the SCB. 


INT/INT 


125 





INTERRUPT. An active signal on this pin notifies the CPU that the 82596 
is requesting an interrupt. This edge-triggered signal can be 
configured to be active high or active low. 




Vcc 


17 Pins 




POWER. +5V ±10%. 


Vss 


17 Pins 




GROUND. 0V. 


TxD 


54 





TRANSMIT DATA. This pin transmits data to the serial link. It is high 
when not transmitting. 


TxC 


64 


I 


TRANSMIT CLOCK. This signal provides the fundamental timing for 
the serial subsystem. The clock is also used to transmit data 
synchronously on the TxD pin. For NRZ encoding, data is transferred 
to the TxD pin on the high to low clock transition. For Manchester 
encoding, the transmitted bit center is aligned with the low to high 
transition. Transmit clock must always be running for proper device 
operation. 


LPBK 


58 





LOOPBACK. This TTL-level control signal enables the loopback 
mode. In this mode serial data on the TxD input is routed through the 
82C501 internal circuits and back to the RxD output without driving the 
transceiver cable. To enable this signal, both internal and external 
loopback need to be set with the Configure command. 


RxD 


60 


I 


RECEIVE DATA. This pin receives NRZ serial data only. It must be 
high when not receiving. 


RxC 


59 


I 


RECEIVE CLOCK. This signal provides timing information to the 
internal shifting logic. For NRZ data the state of the RxD pin is 
sampled on the high to low transition of the clock. 


RTS 


57 





REQUEST TO SEND. When this signal is low the 82596 informs the 
external interface that it has data to transmit. It is forced high after a 
Reset or when transmission is stopped. 


CTS 


62 


I 


CLEAR TO SEND. An active-low signal that enables the 82596 to 
send data. It is normally used as an interface handshake to RTS. 
Asserting CTS high stops transmission. CTS is internally synchronized. 
If CTS goes inactive, meeting the setup time to the TxC negative edge, 
the transmission will stop and RTS will go inactive within, at most, two 
TxC cycles. 


CRS 


63 


I 


CARRIER SENSE. This active-low signal notifies the 
82596 that there is traffic on the serial link. It is only used if the 82596 is 
configured for external Carrier Sense. In this configuration external 
circuitry is required for detecting traffic on the serial link. CRS is 
internally synchronized. To be accepted, the signal must remain active 
for at least two serial clock cycles (for CRSF = 0). 


CDT 


61 


I 


COLLISION DETECT. This active-low signal informs the 82596 that a 
collision has occurred. It is only used if the 82596 is configured for 
external Collision Detect. External circuitry is required for collision 
detection. CDT is internally synchronized. To be accepted, the signal 
must remain active for at least two serial clock cycles (for CDTF = 0). 



82596 AND HOST CPU INTERACTION 

The 82596CA and the host CPU communicate through shared memory. 
Because of its on-chip DMA capability, the 82596 can make data block 
transfers (buffers and frames) independently of the CPU; this greatly 
reduces the CPU byte transfer overhead. 

The 82596 is a multitasking coprocessor that comprises two independent 
logical units: the Command Unit (CU) and the Receive Unit (RU). The CU 
executes commands from shared memory. The RU handles all activities 
related to frame reception. The independence of the CU and RU enables 
the 82596 to engage in both activities simultaneously: the CU can fetch 
and execute commands from memory while the RU is storing received 
frames in memory. The CPU is only involved with this process after the 
CU has executed a sequence of commands or the RU has finished storing 
a sequence of frames. 

The CPU and the 82596 use the hardware signals Interrupt (INT) and 
Channel Attention (CA) to initiate communication with the System 
Control Block (SCB); see Figure 4. The 82596 uses INT to alert the CPU 
of a change in the contents of the SCB; the CPU uses CA to alert the 
82596. 

The 82596 has a CPU Port Access state that allows the CPU to execute 
certain functions without accessing memory. The 82596 PORT pin and 
data bus pins are used to enable this feature. The CPU can directly 
activate four operations when the 82596 is in this state. 

• Write an alternative System Configuration Pointer (SCP). This can be 
used when the 82596 cannot use the default SCP address space. 

• Write a different Dump Command Pointer and execute Dump. This can 
be used for troubleshooting No Response problems. 

• Reset the 82596 via software without disturbing the rest of the 
system. 

• Run a self-test for board testing; the 82596 will execute a self-test 
and write the results to memory. 



82596 BUS INTERFACE 

The 82596CA has bus interface timings and pin definitions that are 
compatible with Intel's 32-bit i486™ SX and i486™ DX microprocessors. 
This eliminates the need for additional bus interface logic. 
Operating at 33 MHz, the 82596's bus bandwidth can be as high as 
106 MB/s. Since Ethernet only requires 1.25 MB/s, this leaves a 
considerable amount of bandwidth for the CPU. The 82596 also has a 
bus throttle to regulate its use of the bus. Two timers can be 
programmed through the SCB: one controls the maximum time the 82596 
can remain on the bus, the other controls the time the 82596 must 
stay off the bus (see Figure 5). The bus throttle can be programmed 
to trigger internally with HLDA or externally with BREQ. These timers 
can restrict the 82596 HOLD activation time and improve bus 
utilization. 

82596 MEMORY ADDRESSING 

The 82596 has a 32-bit memory address range, 
which allows addressing up to four gigabytes of 
memory. The 82596 has three memory addressing 
modes (see Table 1). 

• 82586 Mode. The 82596 has a 24-bit memory address range. The System 
Control Block, Command List, Receive Descriptor List, and Buffer 
Descriptors must reside in one 64-KB memory segment. Transmit and 
Receive buffers can reside in a 24-bit address space. 

• 32-Bit Segmented Mode. The 82596 has a 32-bit memory address range. 
The System Control Block, Command List, Receive Descriptor List, and 
Buffer Descriptors must reside in one 64-KB memory segment. Transmit 
and Receive buffers can reside in a 32-bit address space. 

• Linear Mode. The 82596 has a 32-bit memory address range. Any memory 
structure can reside anywhere within the 32-bit memory address range. 

Figure 4. 82596 and Host CPU Intervention 
Figure 5. Bus Throttle Timers 
Table 1. 82596 Memory Addressing Formats 

                                              Operation Mode 
Pointer or Offset         82586                     32-Bit Segmented          Linear 

ISCP Address              24-Bit Linear             32-Bit Linear             32-Bit Linear 
SCB Address               Base (24) + Offset (16)   Base (32) + Offset (16)   32-Bit Linear 
Command Block Pointers    Base (24) + Offset (16)   Base (32) + Offset (16)   32-Bit Linear 
Rx Frame Descriptors      Base (24) + Offset (16)   Base (32) + Offset (16)   32-Bit Linear 
Tx Frame Descriptors      Base (24) + Offset (16)   Base (32) + Offset (16)   32-Bit Linear 
Rx Buffer Descriptors     Base (24) + Offset (16)   Base (32) + Offset (16)   32-Bit Linear 
Tx Buffer Descriptors     Base (24) + Offset (16)   Base (32) + Offset (16)   32-Bit Linear 
Rx Buffers                24-Bit Linear             32-Bit Linear             32-Bit Linear 
Tx Buffers                24-Bit Linear             32-Bit Linear             32-Bit Linear 
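
The addressing formats in Table 1 can be illustrated with a short C sketch. It is not taken from the 82596 documentation: the type and function names are hypothetical, and truncating the 82586-mode result to 24 bits is an assumption made for illustration.

```c
#include <stdint.h>

/* Addressing modes from Table 1. The names are illustrative only. */
typedef enum { MODE_82586, MODE_SEGMENTED_32, MODE_LINEAR } addr_mode_t;

/*
 * Resolve the address of a control structure (SCB, command block,
 * frame descriptor or buffer descriptor).
 *   base  : segment base (24 bits used in 82586 mode, 32 bits in
 *           32-bit Segmented mode, ignored in Linear mode)
 *   value : 16-bit offset in the two segmented modes,
 *           full 32-bit address in Linear mode
 */
static uint32_t resolve_control_pointer(addr_mode_t mode,
                                        uint32_t base, uint32_t value)
{
    switch (mode) {
    case MODE_82586:
        /* Base (24) + Offset (16); 24-bit wrap is an assumption. */
        return ((base & 0x00FFFFFFu) + (value & 0xFFFFu)) & 0x00FFFFFFu;
    case MODE_SEGMENTED_32:
        /* Base (32) + Offset (16). */
        return base + (value & 0xFFFFu);
    case MODE_LINEAR:
    default:
        /* 32-bit linear address, no base involved. */
        return value;
    }
}
```

Data buffers are not covered by this helper, since Table 1 lists them as plain 24-bit or 32-bit linear addresses in every mode.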
Figure 6. 82596 Shared Memory Structure 



82596 SYSTEM MEMORY STRUCTURE 

The Shared Memory structure consists of four parts: 
the Initialization Root, the System Control Block, the 
Command List, and the Receive Frame Area (see 
Figure 6). 

The Initialization Root is in an established location 
known to the host CPU and the 82596 (00FFFFF6h). 
However, the CPU can establish the Initialization 
Root in another location by using the CPU Port ac- 
cess. This root is accessed during initialization, and 
points to the System Control Block. 



The System Control Block serves as a bidirectional 
mail drop for the host CPU and the 82596 CU and 
RU. It is the central point through which the CPU and 
the 82596 exchange control and status information. 
The SCB has two areas. The first contains instruc- 
tions from the CPU to the 82596. These include: 
control of the CU and RU (Start, Abort, Suspend, 
and Resume), a pointer to the list of CU commands, 
a pointer to the Receive Frame Area, a set of Inter- 
rupt Acknowledge bits, and the T-ON and T-OFF 
timers for the bus throttle. The second area contains 
status information the 82596 is sending to the CPU, 
such as the CU and RU states (Idle, Active, Ready, 
Suspended, No Receive Resources, etc.), interrupt 
bits (Command Completed, Frame Received, 
CU Not Ready, and RU Not Ready), and statistical 
counters. 

The Command List functions as a program for the 
CU; individual commands are placed in memory 
units called Command Blocks (CBs). These CBs 
contain the parameters and status of specific high- 
level commands called Action Commands; e.g., 
Transmit or Configure. 

Transmit causes the 82596 to transmit a frame. The 
Transmit CB contains the destination address, the 
length field, and a pointer to a list of linked buffers 
holding the frame that is to be constructed from sev- 
eral buffers scattered throughout memory. The 
Command Unit operates without CPU intervention; 
the DMA for each buffer, and the prefetching of ref- 
erences to new buffers, is performed in parallel. The 
CPU is notified only after a transmission is complete. 

The Receive Frame Area is a list of Free Frame De- 
scriptors (descriptors not yet used) and a list of user- 
prepared buffers. Frames arrive at the 82596 unso- 
licited; the 82596 must always be ready to receive 
and store them in the Free Frame Area. The Re- 
ceive Unit fills the buffers when it receives frames, 
and reformats the Free Buffer List into received- 
frame structures. The frame structure is, for all prac- 
tical purposes, identical to the format of the frame to 
be transmitted. The first Frame descriptor is refer- 
enced by the SCB. Unless the 82596 is configured 
to Save Bad Frames, the frame descriptor, and the 
associated buffer descriptor, which is wasted when 
a bad frame is received, are automatically reclaimed 
and returned to the Free Buffer List. 

Receive buffer chaining (storing incoming frames in 
a linked buffer list) significantly improves memory 
utilization. Without buffer chaining, the user must al- 
locate consecutive blocks of memory, each capable 
of containing a maximum frame (for Ethernet, 1518 
bytes). Since an average frame is about 200 bytes, 
this is very inefficient. With buffer chaining, the user 
can allocate small buffers and the 82596 will only 
use those that are needed. 
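
The saving can be worked through numerically. The sketch below uses the figures quoted above (1518-byte maximum frame, roughly 200-byte average frame); the 128-byte chained buffer size used in the discussion is a hypothetical choice, and the function names are illustrative.

```c
#include <stddef.h>

/* Number of fixed-size buffers needed to hold one frame
 * (ceiling division). */
static size_t buffers_needed(size_t frame_len, size_t buf_size)
{
    return (frame_len + buf_size - 1) / buf_size;
}

/* Bytes of buffer memory tied up by one frame when buffers of
 * buf_size bytes are chained together. */
static size_t memory_used(size_t frame_len, size_t buf_size)
{
    return buffers_needed(frame_len, buf_size) * buf_size;
}
```

With one maximum-size buffer per frame, a 200-byte frame ties up 1518 bytes. With 128-byte chained buffers the same frame uses two buffers (256 bytes), while a maximum-length frame would use twelve.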

Figure 7 A-D illustrates how the 82596 uses the Receive Frame Area. 
Figure 7A shows an unused Receive Frame Area composed of Free Frame 
Descriptors and Free Receive Buffers prepared by the user. The SCB 
points to the first Frame Descriptor of the Frame Descriptor List. 
Figure 7B shows the same Receive Frame Area after receiving one 
frame. This first frame occupies two Receive Buffers and one Frame 
Descriptor (a valid received frame will only occupy one Frame 
Descriptor). After receiving this frame the 82596 sets the next Free 
Frame Descriptor RBD pointer to the next Free RBD. Figure 7C shows 
the RFA after receiving a second frame. In this example the second 
frame occupies only one Receive Buffer and one RFD. The 82596 again 
sets the RBD pointer. This process is repeated in Figure 7D, showing 
the reception of another frame using one Receive Buffer; in this 
example there is an extra Frame Descriptor. 



TRANSMIT AND RECEIVE MEMORY 
STRUCTURES 

There are three memory structures for reception and transmission: 
the 82586 memory structure, the Flexible memory structure, and the 
Simplified memory structure. The 82586 mode is selected by 
configuring the 82596 during initialization. In this mode all the 
82596 memory structures are compatible with the 82586 memory 
structures. 

When the 82596 is not configured to the 82586 
mode, the other two memory structures, Simplified 
and Flexible, are available for transmitting and re- 
ceiving. These structures can be selected on a 
frame-by-frame basis by setting the S/F bit in the 
Transmit Command and the Receive Frame De- 
scriptor (see Figures 29, 30, 41, and 42). The Simpli- 
fied memory structure offers a simple structure for 
ease of programming (see Figure 8). All information 
about a frame is contained in one structure; for ex- 
ample, during reception the RFD and data field are 
contained in one structure. 

The Flexible memory structure (see Figure 9) has a 
control field that allows the programmer to specify 
the amount of receive data the RFD will contain for 
receive operations and the amount of transmit data 
the Transmit Command Block will contain for trans- 
mit operations. For example, when the control field 
in the RFD is set to 20 bytes during a reception, the 
first 20 bytes of the data field are stored in the RFD 
(6 bytes of destination address, 6 bytes of source 
address, 2 bytes of length field, and 6 bytes of data) 
and the remainder of the data field is stored in the 
Receive Data Buffers. This is useful for capturing 
frame headers when header information is con- 
tained in the data field. The header information can 
then be automatically stored in the RFD partitioned 
from the Receive Data Buffer. 
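
The 20-byte example above can be expressed as a small C sketch. The structure and function names are hypothetical; `frame_len` is assumed to count the destination address, source address, length field and data, and `rfd_size` stands for the value programmed in the RFD control field.

```c
#include <stddef.h>

/* How the bytes of one received frame are divided between the RFD
 * and the Receive Data Buffers in the Flexible memory structure. */
typedef struct {
    size_t in_rfd;      /* bytes stored in the RFD itself */
    size_t in_buffers;  /* bytes stored in the Receive Data Buffers */
} rx_split_t;

static rx_split_t split_received_frame(size_t frame_len, size_t rfd_size)
{
    rx_split_t s;
    /* The RFD takes the first rfd_size bytes, or the whole frame if
     * the frame is shorter than that. */
    s.in_rfd = frame_len < rfd_size ? frame_len : rfd_size;
    s.in_buffers = frame_len - s.in_rfd;
    return s;
}
```

For a frame with a 14-byte header and 200 data bytes (214 bytes in total) and a control field of 20, the RFD holds the header plus the first 6 data bytes, and the remaining 194 bytes go to the Receive Data Buffers.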

The control field can also be used for the Transmit 
Command when the Flexible memory structure is 
used. The quantity of data field bytes to be transmit- 
ted from the Transmit Command Block is specified 
by the variable control field. 

Figure 7. Frame Reception in the RFA 
Figure 8. Simplified Memory Structure 
Figure 9. Flexible Memory Structure 

TRANSMITTING FRAMES 

The 82596 executes high-level Action Commands 
from the Command List in system memory. Action 
Commands are fetched and executed in parallel with 
the host CPU operation, thereby significantly improv- 
ing system performance. The format of the Action 
Commands is shown in Figure 10. Figure 28 shows 
the 82586 mode, and Figures 29 and 30 show the 
command formats of the Linear and 32-bit Segment- 
ed modes. 

A single Transmit command contains, as part of the 
command-specific parameters, the destination ad- 
dress and length field of the transmitted frame and a 
pointer to buffer area in memory containing the data 
portion of the frame. The data field is contained in a 
memory data structure consisting of a buffer de- 
scriptor (BD) and a data buffer — or a linked list of 
buffer descriptors and buffers — as shown in Figure 
11. 

Multiple data buffers can be chained together using 
the BDs. Thus, a frame with a long data field can be 
transmitted using several (shorter) data buffers 
chained together. This chaining technique allows the 
system designer to develop efficient buffer manage- 
ment. 

The 82596 automatically generates the preamble 
(alternating 1s and 0s) and start frame delimiter, 
fetches the destination address and length field from 
the Transmit command, inserts its unique address 
as the source address, fetches the data field speci- 
fied by the Transmit command, and computes and 
appends the CRC to the end of the frame (see Fig- 
ure 12). In the Linear and 32-bit Segmented mode 
the CRC can be optionally inserted on a frame-by- 
frame basis by setting the NC bit in the Transmit 
Command Block (see Figures 29 and 30). 

The 82596 can be configured to generate two types of start and end 
frame delimiters: End of Carrier (EOC) or HDLC. In EOC mode the start 
frame delimiter is 10101011 and the end frame delimiter is indicated 
by the lack of a signal after the last bit of the frame check 
sequence field has been transmitted. In EOC mode the 82596 can be 
configured to extend short frames by adding pad bytes (7Eh) during 
transmission, according to the length field. In HDLC mode the 82596 
will generate the 01111110 flag for the start and end frame 
delimiters, and do standard bit stuffing and stripping. Furthermore, 
the 82596 can be configured to pad frames shorter than the specified 
minimum frame length by appending the appropriate number of flags to 
the end of the frame. 
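
Standard HDLC bit stuffing inserts a 0 after every run of five consecutive 1s so that the 01111110 flag can never appear inside the frame body. The C sketch below illustrates that rule in software; the 82596 performs it in hardware, and the function name and one-bit-per-array-element representation are choices made only for this example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Bit-stuff nbits input bits (one bit per array element, 0 or 1).
 * Writes the stuffed stream to out and returns its length in bits.
 * out must have room for nbits + nbits / 5 elements.
 */
static size_t hdlc_stuff(const uint8_t *in, size_t nbits, uint8_t *out)
{
    size_t n = 0;
    unsigned ones = 0;  /* length of the current run of 1s */

    for (size_t i = 0; i < nbits; i++) {
        out[n++] = in[i];
        if (in[i]) {
            if (++ones == 5) {
                out[n++] = 0;  /* stuffed 0 after five 1s */
                ones = 0;
            }
        } else {
            ones = 0;
        }
    }
    return n;
}
```

The receiver reverses the process by deleting any 0 that follows five consecutive 1s (stripping).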

When a collision occurs, the 82596 manages the 
jam, random wait, and retry processes, reinitializing 
DMA pointers without CPU intervention. Multiple 
frames can be sent by linking the appropriate num- 
ber of Transmit commands together. This is particu- 
larly useful when transmitting a message larger than 
the maximum frame size (1518 bytes for Ethernet). 
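
As a rough illustration, the number of linked Transmit commands needed for a long message can be estimated as below. This sketch assumes the 1518-byte maximum frame consists of a 14-byte header (6-byte destination, 6-byte source, 2-byte length), up to 1500 data bytes, and a 4-byte CRC; the macro and function names are hypothetical.

```c
#include <stddef.h>

#define ETH_MAX_FRAME   1518u  /* maximum frame size quoted above */
#define ETH_HEADER_LEN  14u    /* destination 6 + source 6 + length 2 */
#define ETH_CRC_LEN     4u     /* assumed 32-bit frame check sequence */
#define ETH_MAX_DATA    (ETH_MAX_FRAME - ETH_HEADER_LEN - ETH_CRC_LEN)

/* Number of Transmit commands (one frame each) needed to carry a
 * message of message_len data bytes. */
static size_t transmit_commands_needed(size_t message_len)
{
    return (message_len + ETH_MAX_DATA - 1) / ETH_MAX_DATA;
}
```

Under these assumptions a 4000-byte message needs three linked Transmit commands: two full frames of 1500 data bytes and one of 1000.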











Figure 10. Action Command Format (fields: Command Status; Control 
Fields and Command; Link Field, a pointer to the next command; 
Parameter Field, command-specific parameters) 

RECEIVING FRAMES 

To reduce CPU overhead, the 82596 is designed to 
receive frames without CPU supervision. The host 
CPU first sets aside an adequate receive buffer 
space and then enables the 82596 Receive Unit. 
Once enabled, the RU watches for arriving frames 
and automatically stores them in the Receive Frame 
Area (RFA). The RFA contains Receive Frame De- 
scriptors, Receive Buffer Descriptors, and Data Buff- 
ers (see Figure 13). The individual Receive Frame 
Descriptors make up a Receive Descriptor List 
(RDL) used by the 82596 to store the destination 
and source addresses, the length field, and the 
status of each frame received (see Figure 14). 

Once enabled, the 82596 checks each passing 
frame for an address match. The 82596 will recog- 
nize its own unique address, one or more multicast 
addresses, or the broadcast address. If a match is 
found the 82596 stores the destination and source 
addresses and the length field in the next available 
RFD. It then begins filling the next available Data 
Buffer on the FBL, which is pointed to by the current 
RFD, with the data portion of the incoming frame. As 
one Data Buffer is filled, the 82596 automatically 
fetches the next DB on the FBL until the entire frame 
is received. This buffer chaining technique is particu- 
larly memory efficient because it allows the system 
designer to set aside buffers to fit frames much 
shorter than the maximum allowable frame length. If 
AL-LOC = 1, or if the flexible memory structure is 
used, the addresses and length field can be placed 
in the Receive Buffer. 

Once the entire frame is received without error, the 
82596 does the following housekeeping tasks. 

• The actual count field of the last Buffer Descriptor used to hold 
the frame just received is updated with the number of bytes stored in 
the associated Data Buffer. 

• The next available Receive Frame Descriptor is fetched. 

• The address of the next available Buffer Descriptor is written to 
the next available Receive Frame Descriptor. 

• A frame received interrupt status bit is posted in the SCB. 

• An interrupt is sent to the CPU. 

If a frame error occurs, for example a CRC error, the 82596 
automatically reinitializes its DMA pointers and reclaims any data 
buffers containing the bad frame. The 82596 will continue to receive 
frames without CPU help as long as Receive Frame Descriptors and Data 
Buffers are available. 

82596 NETWORK MANAGEMENT 
AND DIAGNOSTICS 

The behavior of data communication networks is 
normally very complex because of their distributed 
and asynchronous nature. It is particularly difficult to 
pinpoint a failure when it occurs. The 82596 has ex- 
tensive diagnostic and network management func- 
tions that help improve reliability and testability. The 
82596 reports on the following events after each 
frame is transmitted. 

• Transmission successful. 

• Transmission unsuccessful. Lost Carrier Sense. 

• Transmission unsuccessful. Lost Clear to Send. 

• Transmission unsuccessful. A DMA underrun oc- 
curred because the system bus did not keep up 
with the transmission. 

• Transmission unsuccessful. The number of colli- 
sions exceeded the maximum allowed. 

• Number of Collisions. The number of collisions 
experienced during the frame. 

• Heartbeat Indicator. This indicates the presence 
of a heartbeat during the last Interframe Spacing 
(IFS) after transmission. 

When configured to Save Bad Frames the 82596 
checks each incoming frame and reports the follow- 
ing errors. 

• CRC error. Incorrect CRC in a properly aligned 
frame. 

• Alignment error. Incorrect CRC in a misaligned 
frame. 

• Frame too short. The frame is shorter than the 
value configured for minimum frame length. 

• Overrun. Part of the frame was not placed in 
memory because the system bus did not keep up 
with incoming data. 

• Out of buffer. Part of the frame was discarded 
because of insufficient memory storage space. 

• Receive collision. A collision was detected during 
reception. 

• Length error. A frame not matching the frame 
length parameter was detected. 
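
The distinction between CRC errors, alignment errors and short frames can be sketched in C as follows. This is an illustration only: it assumes that a misaligned frame is one whose bit count is not a multiple of eight, the order in which the conditions are tested is an arbitrary choice, and all names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

typedef enum {
    RX_OK,
    RX_SHORT_FRAME,
    RX_ALIGNMENT_ERROR,
    RX_CRC_ERROR
} rx_result_t;

/*
 * Classify a received frame.
 *   bit_count       : number of bits received
 *   crc_ok          : true if the CRC check passed
 *   min_frame_bytes : configured minimum frame length
 */
static rx_result_t classify_frame(size_t bit_count, bool crc_ok,
                                  size_t min_frame_bytes)
{
    if (bit_count / 8 < min_frame_bytes)
        return RX_SHORT_FRAME;
    if (!crc_ok)
        /* Bad CRC: misaligned frames count as alignment errors,
         * properly aligned frames as CRC errors. */
        return (bit_count % 8 != 0) ? RX_ALIGNMENT_ERROR : RX_CRC_ERROR;
    return RX_OK;
}
```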
Figure 13. Receive Frame Area Diagram 
Figure 14. Receive Frame Descriptor 

NETWORK PLANNING AND 
MAINTENANCE 

To properly plan, operate, and maintain a communi- 
cation network, the network management entity 
must accumulate information on network behavior. 
The 82596 provides a rich set of network-wide diag- 
nostics that can serve as the basis for a network 
management entity. 

Information on network activity is provided in the 
status of each frame transmitted. The 82596 reports 
the following activity indicators after each frame. 

• Number of collisions. The number of collisions the 82596 
experienced while attempting to transmit the frame. 

• Deferred transmission. During the first transmission attempt the 
82596 had to defer to traffic on the link. 

The 82596 updates its 32-bit statistical counters af- 
ter each received frame that both passes address 
filtering and is longer than the Minimum Frame 
Length configuration parameter. The 82596 reports 
the following statistics. 

• CRC errors. The number of well-aligned frames that experienced a 
CRC error. 

• Alignment errors. The number of misaligned frames that experienced 
a CRC error. 

• No resources. The number of frames that were discarded because of 
insufficient resources for reception. 

• Overrun errors. The number of frames that were not completely 
stored in memory because the system bus did not keep up with incoming 
data. 

• Receive Collision counter. The number of collisions detected during 
receive. 

• Short Frame counter. The number of frames that were discarded 
because they were shorter than the configured minimum frame length. 

The 82596 can be configured to Promiscuous mode. 
In this mode it captures all frames transmitted on the 
network without checking the Destination Address. 
This is useful when implementing a monitoring sta- 
tion to capture all frames for analysis. 

A useful method of capturing frame headers is to use the Simplified 
memory mode, configure the 82596 to Save Bad Frames, and configure 
the 82596 to Promiscuous mode with space in the RFD allocated for a 
specific number of receive data bytes. The 82596 will receive all 
frames and put them in the RFD. Frames that exceed the available 
space in the RFD will be truncated, the status will be updated, and 
the 82596 will retrieve the next RFD. This allows the user to capture 
the initial data bytes of each frame (for instance, the header) and 
discard the remainder of the frame. 

The 82596 also has a monitor mode for network 
analysis. During normal operation the receive func- 
tion enables the 82596 to receive frames that pass 
address filtering. These frames must have the Start 
of Frame Delimiter (SFD) field and must be longer 
than the absolute minimum frame length of 5 bytes 
(6 bytes in case of Multicast address filtering). Con- 
tents and status of the received frames are trans- 
ferred to memory. The monitor function enables the 
82596 to simply evaluate the incoming frames. The 
82596 can monitor the frames that pass or do not 
pass the address filtering. It can also monitor frames 
which do not have the SFD fields. The 82596 can be 
configured to only keep statistical information about 
monitor frames. Three options are available in the 
Monitor mode. These options are selected by the 
two monitor mode configuration bits available in the 
configuration command. 

When the first option is selected, the 82596 receives 
good frames that pass address filtering and trans- 
fers them to memory while monitoring frames that 
do not pass address filtering or are shorter than the 
minimum frame size (these frames are not trans- 
ferred to memory). When this option is used the 
82596 updates six counters: CRC errors, alignment 
errors, no resource errors, overrun errors, short 
frames and total good frames received. 

When the second option is selected, the receive 
function is completely disabled. The 82596 monitors 
only those frames that pass address filtering and 
meet the minimum frame length requirement. When 
this option is used the 82596 updates six counters: 
CRC errors, alignment errors, total frames (good and 
bad), short frames, collisions detected and total 
good frames. 

When the third option is selected, the receive func- 
tion is completely disabled. The 82596 monitors all 
frames, including frames that do not have a Start 
Frame Delimiter. When this option is used the 82596 
updates six counters: CRC errors, alignment errors, 
total frames (good and bad), short frames, collisions 
detected and total good frames. 

STATION DIAGNOSTICS 
AND SELF-TEST 

The 82596 provides a large set of diagnostic and 
network management functions. These include inter- 
nal and external loopback and time domain reflec- 
tometry for locating fault points in the network cable. 
The 82596 ensures software reliability by dumping 
the contents of the 82596 internal registers into sys- 
tem memory. The 82596 has a self-test mode that 
enables it to run an internal self-test and place the 
results in system memory. 



82586 SOFTWARE COMPATIBILITY 

The 82596 has a software-compatible state in which 
all its memory structures are compatible with the 
82586 memory structure. This includes all the Action 
Commands, the Receive Frame Area (including the 
RFD, Buffer Descriptors, and Data Buffers), the Sys- 
tem Control Block, and the initialization procedures. 
There are two minor differences between the 82596 
in the 82586-Compatible memory structure and the 
82586. 

• When the internal and external loopback bits in the Configure 
command are set to 11 the 82596 is in external loopback and the LPBK 
pin is activated; in the 82586 this situation would produce internal 
loopback. 

• During a Dump command both the 82596 and 82586 dump the same number 
of bytes; however, the data format is different. 



INITIALIZING THE 82596 

A Reset command is issued to the 82596 to prepare it for normal 
operation. The 82596 is initialized through two data structures that 
are addressed by two pointers, the System Configuration Pointer (SCP) 
and the Intermediate System Configuration Pointer (ISCP). The 
initialization procedure begins when a Channel Attention signal is 
asserted after RESET. By default, the 82596 uses 00FFFFF4h as the 
address of the double word that contains the SCP. Before the CA 
signal is asserted this default address can be changed to any other 
available address by asserting the PORT pin and providing the desired 
address over the D31-D4 pins of the data bus. Pins D3-D0 must be 
0010; i.e., any alternative address must be aligned to 16-byte 
boundaries. All addresses sent to the 82596 must be word aligned, 
which means that all pointers and memory structures must start on an 
even address (A0 = zero). 
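
The value placed on the data bus for an alternative SCP address can be sketched as follows. This is a minimal illustration of the encoding described above (address on D31-D4, 0010 on D3-D0); the macro and function names are hypothetical, and the actual write to the device, including activating the PORT pin, is board-specific and not shown.

```c
#include <stdbool.h>
#include <stdint.h>

#define PORT_FUNC_ALT_SCP 0x2u  /* D3-D0 = 0010, as stated above */

/*
 * Build the 32-bit value for a CPU Port access that programs an
 * alternative SCP address. Returns false if scp_addr is not aligned
 * to a 16-byte boundary, in which case *port_word is left untouched.
 */
static bool make_alt_scp_port_word(uint32_t scp_addr, uint32_t *port_word)
{
    if (scp_addr & 0xFu)
        return false;  /* low four address bits must be zero */
    *port_word = scp_addr | PORT_FUNC_ALT_SCP;
    return true;
}
```

For example, an SCP placed at 00100000h would be programmed with the value 00100002h, while 00100004h would be rejected as misaligned.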



SYSTEM CONFIGURATION POINTER 
(SCP) 

The SCP contains the sysbus byte and the location of the next 
structure of the initialization process, the ISCP. The following 
parameters are selected in the SYSBUS. 

• The 82596 operation mode. 

• The Bus Throttle timer triggering method. 

• Lock enabled. 

• Interrupt polarity. 

• Big Endian 32-bit entity mode. 

Byte ordering is determined by the LE/BE pin. LE/BE = 1 selects 
Little Endian byte ordering and LE/BE = 0 selects Big Endian byte 
ordering. 

NOTE: 

In the following, X indicates a bit not checked in 82586 mode. This 
bit must be set to 0 in all other modes. 

The following diagram illustrates the format of the SCP. 



Address      ODD WORD (bits 31-16)             EVEN WORD (bits 15-0)
             31-24        23-16
00FFFFF4h    XXXXXXXX     SYSBUS               0000 0000 0000 0000
00FFFFF8h    XXXXXXXX     XXXXXXXX             XXXXXXXX XXXXXXXX
00FFFFFCh    A31 ... A24  A23 .............. ISCP ADDRESS ............... A0

A31-A24 are not checked in 82586 mode.

X...X areas are not checked in 82586 mode; they must be 0 in all other modes.

SYSBUS byte (bits 23-16 of the first double word):

Bit 23   Bit 22   Bit 21   Bit 20   Bit 19   Bit 18   Bit 17   Bit 16
(b7)     1        INT      LOCK     TRG      M1       M0       X

Bit 23 (bit 7 of the SYSBUS byte), Big Endian 32-bit entity mode:
    0 - The 32-bit address pointers in Linear mode are treated as two 16-bit big endian entities.
        This is identical to the 82596 A1 stepping definition.
    1 - The 32-bit address pointers in Linear mode are treated as 32-bit big endian entities.
        This mode is only supported in the 82596 B stepping. In this mode the SCB absolute
        address and statistical counters are still treated as two 16-bit big endian entities.

INT, interrupt polarity:
    0 - Interrupt pin is active high
    1 - Interrupt pin is active low

LOCK:
    0 - Lock function enabled
    1 - Lock function disabled

TRG:
    0 - Internal triggering of the Bus Throttle timers
    1 - External triggering of the Bus Throttle timers

M1 M0, operating mode:
    00 - 82586 mode
    01 - 32-Bit Segmented mode
    10 - Linear mode
    11 - Reserved

X - Not checked

ISCP ADDRESS - The physical address of the ISCP. In the 82586 mode, bits A31-A24 are considered to
be zero.



Figure 15. The System Configuration Pointer 



Writing the Sysbus 



When writing the sysbus byte it is important to pay attention to the byte order. 

• When a Little Endian processor is used, the sysbus byte is located at byte address 00FFFFF6h (or address
n + 2 if an alternative SCP address n was programmed).

• When a processor using Big Endian byte ordering is used, the sysbus, alternative SCP, and ISCP addresses
will be different.

    o The sysbus byte is located at 00FFFFF5h.

    o If an alternative SCP address is programmed, the sysbus byte should be at byte address n + 1.
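
As an illustration of the sysbus byte described above, the following C fragment is a minimal sketch of how a driver might compose the byte and compute where it lives. The macro and function names are hypothetical (they are not Intel definitions), and the bit positions are as read from Figure 15, with bit 6 taken to be the constant 1 shown there.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for the SYSBUS fields of Figure 15. */
#define SYSBUS_MODE_82586      (0u << 1)  /* M1 M0 = 00 */
#define SYSBUS_MODE_SEGMENTED  (1u << 1)  /* M1 M0 = 01 */
#define SYSBUS_MODE_LINEAR     (2u << 1)  /* M1 M0 = 10 */
#define SYSBUS_TRG_EXTERNAL    (1u << 3)  /* external Bus Throttle triggering */
#define SYSBUS_LOCK_DISABLED   (1u << 4)  /* 1 disables the Lock function */
#define SYSBUS_INT_ACTIVE_LOW  (1u << 5)  /* 1 selects an active-low interrupt pin */
#define SYSBUS_BIT6_ONE        (1u << 6)  /* assumed constant 1, per the figure */
#define SYSBUS_BE32            (1u << 7)  /* 32-bit Big Endian entities (B stepping) */

/* Compose a sysbus byte from a mode value and a set of option flags. */
static uint8_t sysbus_make(uint8_t mode, uint8_t flags)
{
    assert((mode & ~(3u << 1)) == 0);      /* mode may only occupy M1 M0 */
    return (uint8_t)(SYSBUS_BIT6_ONE | mode | flags);
}

/* Byte address of the sysbus byte for an SCP located at address scp:
 * n + 2 with Little Endian ordering, n + 1 with Big Endian ordering. */
static uint32_t sysbus_byte_address(uint32_t scp, int big_endian)
{
    return scp + (big_endian ? 1u : 2u);
}
```

For example, sysbus_make(SYSBUS_MODE_LINEAR, SYSBUS_INT_ACTIVE_LOW | SYSBUS_TRG_EXTERNAL) yields 6Ch, and the default SCP at 00FFFFF4h places the byte at 00FFFFF6h for a Little Endian processor.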
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INTERMEDIATE SYSTEM CONFIGURATION POINTER (ISCP) 

The ISCP indicates the location of the System Control Block (SCB). Often the SCP is in ROM and the ISCP is in RAM.
The CPU loads the SCB address (or an equivalent data structure) into the ISCP and asserts CA. This Channel
Attention signal causes the 82596 to begin its initialization procedure and to get the SCB address from the
ISCP and SCP. In 82586 and 32-bit Segmented modes the SCB base address is also the base address of all
Command Blocks, Frame Descriptors, and Buffer Descriptors (but not buffers). All these data structures must
reside in one 64-KB segment; however, in Linear mode no such limitation is imposed.

The following diagram illustrates the ISCP format. 



Address     ODD WORD (bits 31-16)          EVEN WORD (bits 15-0)
                                           15-8          7-0
ISCP        A15 ... SCB OFFSET ... A0      0000 0000     BUSY
ISCP + 4    (31-24)  A23 ............ SCB BASE ADDRESS ............ A0

Bits 31-24 of the double word at ISCP + 4 are XXXXXXXX in 82586 mode and A31-A24 in 32-bit Segmented mode.

BUSY - Indicates that the 82596 is being initialized. The CPU sets BUSY to 01h before it gives
the first CA to the 82596. BUSY is cleared by the 82596 after the SCB base and offset
are read. Note that the most significant byte of the first word of the ISCP is not modified
when BUSY is cleared.

SCB OFFSET - This 16-bit quantity specifies the offset portion of the address of the SCB.

SCB BASE - Specifies the base portion of the address of the SCB. The base of the SCB is also the base of
all 82596 Command Blocks, Frame Descriptors and Buffer Descriptors. In the 82586
mode, bits A31-A24 are considered to be zero.




Figure 16. The Intermediate System Configuration Pointer— 82586 and 32-Bit Segmented Modes 
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BUSY — Indicates that the 82596 is being initialized. The ISCP is set to 01 h by the CPU before its 

first CA to the 82596. It is cleared by the 82596 after the SCB address is read. 

SCB ADDRESS— This 32-bit quantity specifies the physical address of the SCB. 



Figure 17. The Intermediate System Configuration Pointer — Linear Mode. 

INITIALIZATION PROCESS 

The CPU sets up the SCP, ISCP, and SCB structures and, if desired, an alternative SCP address. It also
sets BUSY to 01h. The 82596 is initialized when a Channel Attention signal follows a Reset signal, causing the
82596 to access the System Configuration Pointer. The sysbus byte is read; it supplies the operational mode,
the Bus Throttle timer triggering method, the interrupt polarity, and the state of LOCK. After reset the Bus Throttle
timers are essentially disabled: the T-ON value is infinite and the T-OFF value is zero. After the SCP is read, the
82596 reads the ISCP and saves the SCB address. In 82586 and 32-bit Segmented modes this address is
represented as a base address plus the offset (this base address is also the base address of all the control
blocks). In Linear mode the base address is also an absolute address. The 82596 clears BUSY, sets CX and
CNR to equal 1 in the SCB, clears the SCB command word, sends an interrupt to the CPU, and awaits another
Channel Attention signal. RESET configures the 82596 to its default state before CA is asserted.
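
The CPU side of this sequence can be sketched in C as follows. This is only an illustration under stated assumptions: Linear mode, Little Endian byte ordering, and structures that the CPU and the 82596 see at the same addresses. The structure and function names are hypothetical, and asserting CA itself is board specific and therefore not shown.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical C views of the SCP and the Linear-mode ISCP. */
struct scp {
    uint32_t sysbus_dword;  /* sysbus byte in bits 23-16, all other bits zero */
    uint32_t reserved;      /* must be zero outside 82586 mode */
    uint32_t iscp_addr;     /* physical address of the ISCP */
};

struct iscp {
    uint32_t busy;          /* BUSY in bits 7-0 */
    uint32_t scb_addr;      /* physical address of the SCB */
};

/* Fill in the SCP and ISCP before the first Channel Attention. */
static void prepare_init(struct scp *scp, struct iscp *iscp, uint8_t sysbus,
                         uint32_t iscp_bus_addr, uint32_t scb_bus_addr)
{
    /* All addresses given to the 82596 must be word aligned. */
    assert((iscp_bus_addr & 1u) == 0 && (scb_bus_addr & 1u) == 0);

    memset(scp, 0, sizeof *scp);
    scp->sysbus_dword = (uint32_t)sysbus << 16;
    scp->iscp_addr = iscp_bus_addr;

    iscp->busy = 0x01;      /* cleared by the 82596 once the SCB address is read */
    iscp->scb_addr = scb_bus_addr;
}

/* Returns nonzero once the 82596 has cleared BUSY. */
static int init_done(const volatile struct iscp *iscp)
{
    return (iscp->busy & 0xFFu) == 0;
}
```

After prepare_init() the CPU would assert CA and then either wait for the interrupt or poll init_done().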
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CONTROLLING THE 82596CA 

The host CPU controls the 82596 with the commands, data structures, and methods described in this section.
The CPU and the 82596 communicate through shared memory structures. The 82596 contains two independent
units: the Command Unit and the Receive Unit. The Command Unit executes commands from the CPU,
and the Receive Unit handles frame reception. These two units are controlled and monitored by the CPU
through a shared memory structure called the System Control Block (SCB). The CPU and the 82596 use the
CA and INT signals to communicate with the SCB.



82596 CPU ACCESS INTERFACE (PORT) 

The 82596 has a CPU access interface that allows the host CPU to do four things. 

• Write an alternative System Configuration Pointer address. 

• Write an alternative Dump area pointer and perform Dump. 

• Execute a software reset. 

• Execute a self-test. 

The following events initiate the CPU access state.

• Presence of an address on the D31-D4 data bus pins.

• The D3-D0 pins are used to select one of the four functions.

• The PORT input pin is asserted, as in a regular write cycle.

NOTE:

The SCP, Dump, and Self-Test addresses must be 16-byte aligned.

The 82596 requires two 16-bit write cycles for a PORT command. The first write holds the internal machines and
reads the first 16 bits; the second activates the PORT command and reads the second 16 bits.



The PORT Reset is useful when only the 82596 needs to be reset. The CPU must wait for 10 system clocks and
5 serial clocks before issuing another CA to the 82596; this new CA begins a new initialization process.

The Dump function is useful for troubleshooting No Response problems. If the chip is in a No Response state,
the PORT Dump operation can be executed and a PORT Reset can be used to reinitialize the 82596 without
disturbing the rest of the system.

The Self-Test function can be used for board testing; the 82596 will execute a self-test and write the results to 
memory. 

Table 2. PORT Function Selection 



            Addresses and Results
Function    D31 ... D4                                    D3  D2  D1  D0

Reset       A31 ... Don't Care ... A4                     0   0   0   0

Self-Test   A31 ... Self-Test Results Address ... A4      0   0   0   1

SCP         A31 ... Alternative SCP Address ... A4        0   0   1   0

Dump        A31 ... Dump Area Pointer ... A4              0   0   1   1
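
A PORT command word can be formed by combining a 16-byte aligned address with the function code of Table 2. The C fragment below is a hedged sketch of that encoding; the names are hypothetical, and the order in which the two 16-bit halves are written (lower half first here) is an assumption that depends on the board and bus design.

```c
#include <assert.h>
#include <stdint.h>

/* Function codes carried on D3-D0 (Table 2). Names are illustrative. */
enum port_function {
    PORT_RESET     = 0x0,
    PORT_SELF_TEST = 0x1,
    PORT_ALT_SCP   = 0x2,
    PORT_DUMP      = 0x3
};

/* Combine a 16-byte aligned address (D31-D4) with a function code (D3-D0). */
static uint32_t port_command(enum port_function fn, uint32_t addr)
{
    assert((addr & 0xFu) == 0);   /* SCP, Dump and Self-Test addresses are 16-byte aligned */
    return addr | (uint32_t)fn;
}

/* Split the command into the two 16-bit write cycles the 82596 requires.
 * Assumption: the lower half is written first. */
static void port_split(uint32_t cmd, uint16_t out[2])
{
    out[0] = (uint16_t)(cmd & 0xFFFFu);
    out[1] = (uint16_t)(cmd >> 16);
}
```

For instance, port_command(PORT_ALT_SCP, 0x00123450) gives 00123452h, which matches the rule that D3-D0 must be 0010 when an alternative SCP address is written.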



MEMORY ADDRESSING FORMATS 

The 82596 accesses memory by 32-bit addresses. There are two types of 32-bit addresses: linear and seg- 
mented. The type of address used depends on the 82596 operating mode and the type of memory structure it 
is addressing. The 82596 has three operating modes. 






• 82586 Mode

    o A Linear address is a single 24-bit entity. Address pins A31-A24 are always zero.

    o A Segmented address uses a 24-bit base and a 16-bit offset.

• 32-bit Segmented Mode

    o A Linear address is a single 32-bit entity.

    o A Segmented address uses a 32-bit base and a 16-bit offset.

NOTE:

In the previous two memory addressing modes, each command header (CB, TBD, RFD, RBD, and SCB)
must wholly reside within one segment. If the 82596 encounters a memory structure that does not follow this
restriction, the 82596 will fetch the next contiguous location in memory (beyond the segment).

• Linear Mode

    o A Linear address is a single 32-bit entity.

    o There are no Segmented addresses.

Linear addresses are primarily used to address transmit and receive data buffers. In the 82586 and 32-bit
Segmented modes, segmented addresses (base plus offset) are used for all Command Blocks, Buffer Descriptors,
Frame Descriptors, and System Control Blocks. When using Segmented addresses, only the offset
portion of the entity being addressed is specified in the block. The base for all offsets is the same: that of the
SCB. See Table 1.
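
The base-plus-offset rule can be illustrated with a short C sketch. The function names are hypothetical, and masking the base to 24 bits in 82586 mode is an assumption drawn from the statement that A31-A24 are always zero in that mode.

```c
#include <assert.h>
#include <stdint.h>

/* Physical address of a structure given the SCB base and the structure's 16-bit offset. */
static uint32_t resolve_address(uint32_t scb_base, uint16_t offset, int mode_82586)
{
    if (mode_82586)
        scb_base &= 0x00FFFFFFu;   /* assumption: only a 24-bit base is used in 82586 mode */
    return scb_base + offset;
}

/* Check that a structure of the given size lies wholly inside the 64-KB segment. */
static int fits_in_segment(uint16_t offset, uint32_t size)
{
    assert(size > 0);
    return (uint32_t)offset + size <= 0x10000u;
}
```

A driver could use fits_in_segment() when allocating Command Blocks and descriptors, since a structure that crosses the segment end causes the 82596 to fetch beyond the segment.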

LITTLE ENDIAN AND BIG ENDIAN BYTE ORDERING

The 82596 supports both Little Endian and Big Endian byte ordering for its memory structures. 

The 82596 A1 stepping supports Big Endian byte ordering for word and byte entities. Dword entities are not
supported with 82596 A1 Big Endian byte ordering. This results in slightly different 82596 A1 memory structures
for Big Endian operation. These structures are defined in the 32-Bit LAN Components User's Manual.

The 82596 B stepping supports Big Endian byte ordering for Linear mode only. All 82596 B 32-bit address
pointers are treated as 32-bit Big Endian entities; however, the SCB absolute address and statistical counters
are treated as two 16-bit Big Endian entities. This 32-bit Big Endian entity support is configured through bit 7 in
the SYSBUS byte.

NOTE:

All 82596 memory entities must be word or dword aligned, except that transmit buffers can be byte aligned
for the 82596 B stepping.

An example of a dword entity is a frame descriptor command/status dword, whereas the raw data of the frame
are byte entities. Both 32- and 16-bit buses are supported. When a 16-bit bus is used with Big Endian memory
organization, data lines D15-D0 are used. The 82596 has an internal crossover that handles these swap
operations.

COMMAND UNIT (CU) 

The Command Unit is the logical unit that executes Action Commands from a list of commands very similar to
a CPU program. A Command Block is associated with each Action Command. The CU is modeled as a logical
machine that takes, at any given time, one of the following states.

• Idle. The CU is not executing a command and is not associated with a CB on the list. This is the initial state.

• Suspended. The CU is not executing a command; however, it is associated with a CB on the list.

• Active. The CU is executing an Action Command and pointing to its CB.
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The CPU can affect CU operation in two ways: by issuing a CU Control Command or by setting bits in the 
Command word of the Action Command.

RECEIVE UNIT (RU) 

The Receive Unit is the logical unit that receives frames and stores them in memory. The RU is modeled as a 
logical machine that takes, at any given time, one of the following states. 

• Idle. The RU has no memory resources and is discarding incoming frames. This is the initial state. 

• No Resources. The RU has no memory resources and is discarding incoming frames. This state differs 
from Idle in that the RU accumulates statistics on the number of discarded frames. 

• Suspended. The RU has memory available for storing frames, but is discarding them. The suspend state 
can only be reached if the CPU forces this through the SCB or sets the suspend bit in the RFD. 

• Ready. The RU has memory available and is storing incoming frames. 

The CPU can affect RU operation in three ways: by issuing an RU Control Command, by setting bits in the 
Frame Descriptor Command word of the frame being received, or by setting the EL bit of the current buffer's 
Buffer Descriptor. 

SYSTEM CONTROL BLOCK (SCB) 

The SCB is a memory block that plays a major role in communications between the CPU and the 82596. Such 
communications include the following. 

• Commands issued by the CPU 

• Status reported by the 82596 

Control commands are sent to the 82596 by writing them into the SCB and then asserting CA. The 82596 
examines the command, performs the required action, and then clears the SCB command word. Control 
commands perform the following types of tasks. 

• Operation of the Command Unit (CU). The SCB controls the CU by specifying the address of the Command 
Block List (CBL) and by starting, suspending, resuming, or aborting execution of CBL commands. 

• Operation of the Bus Throttle. The SCB controls the Bus Throttle timers by providing them with new values
and sending the Load and Start timer commands. The timers can be operated in both the 32-bit Segmented
and Linear modes.

• Reception of frames by the Receive Unit (RU). The SCB controls the RU by specifying the address of the 
Receive Frame Area and by starting, suspending, resuming, or aborting frame reception. 

• Acknowledgment of events that cause interrupts. 

• Resetting the chip. 

The 82596 sends status reports to the CPU via the System Control Block. The SCB contains four types of 
status reports. 

• The cause of the current interrupts. These interrupts are caused by one or more of the following 82596
events.

    o The Command Unit completes an Action Command that has its I bit set.

    o The Receive Unit receives a frame.

    o The Command Unit becomes inactive.

    o The Receive Unit becomes not ready.

• The status of the Command Unit. 

• The status of the Receive Unit. 

• Status reports from the 82596 regarding reception of corrupted frames. 
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Events can be cleared only by CPU acknowledgment. If some events are not acknowledged by the ACK field 
the Interrupt signal (INT) will be reissued after Channel Attention (CA) is processed. Furthermore, if a new 
event occurs while an interrupt is set, the interrupt is temporarily cleared to trigger edge-triggered interrupt 
controllers. 

The CPU uses the Channel Attention line to cause the 82596 to examine the SCB. This signal is trailing-edge 
triggered — the 82596 latches CA on the trailing edge. The latch is cleared by the 82596 before the SCB 
control command is read. 
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Figure 18. SCB— 82586 Mode 
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Figure 19. SCB— 32-Bit Segmented Mode 
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Figure 20. SCB— Linear Mode 
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These bits specify the action to be performed as a result of a CA. This word is set by the CPU and cleared by 
the 82596. Defined bits are: 

Bit 31 ACK-CX - Acknowledges that the CU completed an Action Command.

Bit 30 ACK-FR - Acknowledges that the RU received a frame.

Bit 29 ACK-CNA - Acknowledges that the Command Unit became not active.

Bit 28 ACK-RNR - Acknowledges that the Receive Unit became not ready.

Bits 24-26 CUC - (3 bits) This field contains the command to the Command Unit. Valid values are:

    0 - NOP (does not affect current state of the unit).

    1 - Start execution of the first command on the CBL. If a command is executing,
        complete it before starting the new CBL. The beginning of the CBL is in CBL
        OFFSET (address).

    2 - Resume the operation of the Command Unit by executing the next command.
        This operation assumes that the Command Unit has been previously suspended.

    3 - Suspend execution of commands on CBL after current command is complete.

    4 - Abort current command immediately.

    5 - Loads the Bus Throttle timers so they will be initialized with their new values
        after the active timer (T-ON or T-OFF) reaches Terminal Count. If no timer is
        active new values will be loaded immediately. This command is not valid in
        82586 mode.

    6 - Loads and immediately restarts the Bus Throttle timers with their new values.
        This command is not valid in 82586 mode.

    7 - Reserved.

Bit 23 RESET - Reset chip (logically the same as hardware RESET).

Bits 20-22 RUC - (3 bits) This field contains the command to the Receive Unit. Valid values are:

    0 - NOP (does not alter current state of unit).

    1 - Start reception of frames. The beginning of the RFA is contained in the RFA
        OFFSET (address). If a frame is being received complete reception before
        starting.

    2 - Resume frame reception (only when in suspended state).

    3 - Suspend frame reception. If a frame is being received complete its reception
        before suspending.

    4 - Abort receiver operation immediately.

    5-7 - Reserved.
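
The command word layout lends itself to a few C macros. The sketch below uses hypothetical names and treats the command and status words as one 32-bit double word (command in bits 31-16, status in bits 15-0), which is how the bit numbers in the text are given. It relies on the ACK bits sitting 16 positions above the CX, FR, CNA, and RNR status bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for the SCB command word fields. */
#define SCB_ACK_CX   (1u << 31)
#define SCB_ACK_FR   (1u << 30)
#define SCB_ACK_CNA  (1u << 29)
#define SCB_ACK_RNR  (1u << 28)
#define SCB_CUC(n)   (((uint32_t)(n) & 7u) << 24)
#define SCB_RESET    (1u << 23)
#define SCB_RUC(n)   (((uint32_t)(n) & 7u) << 20)

enum cuc { CUC_NOP, CUC_START, CUC_RESUME, CUC_SUSPEND, CUC_ABORT,
           CUC_LOAD_TIMERS, CUC_LOAD_START_TIMERS };
enum ruc { RUC_NOP, RUC_START, RUC_RESUME, RUC_SUSPEND, RUC_ABORT };

/* Build a command that acknowledges every event currently reported in the
 * status word and issues the given CU and RU control commands. */
static uint32_t scb_command(uint32_t status_dword, enum cuc cuc, enum ruc ruc)
{
    assert((unsigned)cuc <= CUC_LOAD_START_TIMERS && (unsigned)ruc <= RUC_ABORT);
    uint32_t ack = (status_dword & 0xF000u) << 16;  /* bits 15-12 map to bits 31-28 */
    return ack | SCB_CUC(cuc) | SCB_RUC(ruc);
}
```

The CPU would write the result into the SCB and then assert CA; the 82596 clears the command word once it has accepted the command.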
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32-Bit Segmented and Linear mode. 

Indicates the status of the 82596. This word is modified only by the 82596. Defined bits are: 

Bit 15 CX - The CU finished executing a command with its I (interrupt) bit set.

Bit 14 FR - The RU finished receiving a frame.

Bit 13 CNA - The Command Unit left the Active state.

Bit 12 RNR - The Receive Unit left the Ready state.

Bits 8-10 CUS - (3 bits) This field contains the status of the command unit. Valid values are:

    0 - Idle

    1 - Suspended

    2 - Active

    3-7 - Not used

Bits 4-7 RUS - This field contains the status of the receive unit. Valid values are:

    0h (0000) - Idle

    1h (0001) - Suspended

    2h (0010) - No Resources. This value indicates both no resources due to lack of
                RFDs in the RDL and no resources due to lack of RBDs in the FBL.

    4h (0100) - Ready

    Ah (1010) - No resources due to no more RBDs (not in the 82586 mode).

    Ch (1100) - No more RBDs (not in 82586 mode).

    No other combinations are allowed.

Bit 3 T - Bus Throttle timers loaded (not in 82586 mode).
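
Decoding the status word is a matter of shifting and masking. The following C sketch, with hypothetical names, extracts the CUS and RUS fields and maps the RUS values listed above to readable strings.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for the event bits of the SCB status word. */
#define SCB_STAT_CX   (1u << 15)
#define SCB_STAT_FR   (1u << 14)
#define SCB_STAT_CNA  (1u << 13)
#define SCB_STAT_RNR  (1u << 12)
#define SCB_STAT_T    (1u << 3)

/* Command Unit status, bits 8-10: 0 Idle, 1 Suspended, 2 Active. */
static unsigned scb_cus(uint16_t status) { return (status >> 8) & 0x7u; }

/* Receive Unit status, bits 4-7. */
static unsigned scb_rus(uint16_t status) { return (status >> 4) & 0xFu; }

/* Human-readable name for an RUS value. */
static const char *rus_name(unsigned rus)
{
    assert(rus <= 0xFu);
    switch (rus) {
    case 0x0: return "Idle";
    case 0x1: return "Suspended";
    case 0x2: return "No Resources";
    case 0x4: return "Ready";
    case 0xA: return "No Resources (no more RBDs)";
    case 0xC: return "No more RBDs";
    default:  return "invalid";
    }
}
```

A status word of 0240h, for example, decodes to an Active Command Unit and a Ready Receive Unit.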




SCB OFFSET ADDRESSES 



CBL Offset (Address) 

In 82586 and 32-bit Segmented modes this 16-bit quantity indicates the offset portion of the address for the 
first Command Block on the CBL. In Linear mode it is a 32-bit linear address for the first Command Block on 
the CBL. It is accessed only if CUC equals Start. 

RFA Offset (Address) 

In 82586 and 32-bit Segmented modes this 16-bit quantity indicates the offset portion of the address for the 
Receive Frame Area. In Linear mode it is a 32-bit linear address for the Receive Frame Area. It is accessed 
only if RUC equals Start. 
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SCB STATISTICAL COUNTERS 

Statistical Counter Operation 

• The CPU is responsible for clearing all error counters before initializing the 82596. The 82596 updates
these counters by reading them, adding 1, and then writing them back to the SCB.

• The counters are wraparound counters. After reaching FFFFFFFFh the counters wrap around to zero.

• The 82596 updates the required counters for each frame. It is possible for more than one counter to be
updated; multiple errors will result in all affected counters being updated.

• The 82596 executes the read-counter/increment/write-counter operation without relinquishing the bus
(locked operation). This is to ensure that no logical contention exists between the 82596 and the CPU due
to both attempting to write to the counters simultaneously. In the dual-port memory configuration the CPU
should not execute any write operation to a counter if LOCK is asserted.

• The counters are 32 bits wide and their behavior is fully compatible with the IEEE 802.3 standard. The
82596 supports all relevant statistics (mandatory, optional, and desired) through the status of the transmit
and receive header and directly through SCB statistics.
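
Because the counters wrap from FFFFFFFFh to zero, software that samples them periodically can rely on unsigned 32-bit subtraction to obtain the increment since the last sample. The C sketch below shows one way to accumulate a 64-bit total; the names are hypothetical, and it assumes the counter wraps at most once between two samples.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Running 64-bit total built from samples of one 32-bit wraparound counter. */
struct counter_track {
    uint32_t last;   /* value seen at the previous sample */
    uint64_t total;  /* accumulated count */
};

/* Increment between two samples; modulo-2^32 arithmetic handles one wraparound. */
static uint32_t counter_delta(uint32_t previous, uint32_t current)
{
    return (uint32_t)(current - previous);
}

/* Fold a new sample into the running total. */
static void counter_update(struct counter_track *t, uint32_t current)
{
    assert(t != NULL);
    t->total += counter_delta(t->last, current);
    t->last = current;
}
```

The sketch only reads the counters, which avoids contending with the locked read-increment-write cycle performed by the 82596.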

CRCERRS 

This 32-bit quantity contains the number of aligned frames discarded because of a CRC error. This counter is 
updated, if needed, regardless of the RU state. 

ALNERRS 

This 32-bit quantity contains the number of frames that both are misaligned (i.e., where CRS deasserts on a 
nonoctet boundary) and contain a CRC error. The counter is updated, if needed, regardless of the RU state. 

SHRTFRM 

This 32-bit quantity contains the number of received frames shorter than the minimum frame length. 
The last three counters change function in monitor mode. 

RSCERRS 

This 32-bit quantity contains the number of good frames discarded because there were no resources to
contain them. Frames intended for a host whose RU is in the No Receive Resources state fall into this
category. This counter is updated only if the RU is in the No Resources state. When in Monitor mode this
counter counts the total number of frames, good and bad.
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OVRNERRS 

This 32-bit quantity contains the number of frames known to be lost because the local system bus was not 
available. If the traffic problem lasts longer than the duration of one frame, the frames that follow the first are 
lost without an indicator, and they are not counted. This counter is updated, if needed, regardless of the RU 
state. 

RCVCDT 

This 32-bit quantity contains the number of collisions detected during frame reception. In Monitor mode this 
counter counts the total number of good frames. 

ACTION COMMANDS AND OPERATING MODES 

This section lists all the Action Commands of the Command Unit Command Block List (CBL). Each command 
contains the Command field, the Status and Control fields, the link to the next Action Command, and any 
command-specific parameters. There are three basic types of action commands: 82596 Configuration and 
Setup, Transmission, and Diagnostics. The following is a list of the actual commands. 

• NOP                         • Transmit

• Individual Address Setup    • TDR

• Configure                   • Dump

• MC Setup                    • Diagnose

The 82596 has three addressing modes. In the 82586 mode all the Action Commands look exactly like those
of the 82586.

• 82586 Mode. The 82596 software and memory structure is compatible with the 82586.

• 32-Bit Segmented Mode. The 82596 can access the entire system memory and use the two new memory
structures, Simplified and Flexible, while still using the segmented approach. This does not require any
significant changes to existing software.

• Linear Mode. The 82596 operates in a flat, linear, 4-gigabyte memory space without segmentation. It can
also use the two new memory structures.

In the 32-bit Segmented mode there are some differences between the 82596 and 82586 action commands,
mainly in programming and activating new 82596 features. Those bits marked "don't care" in the compatible
mode are not checked; however, we strongly recommend that those bits all be zeroes; this will allow future
enhancements and extensions.

In the Linear mode all of the address offsets become 32-bit address pointers. All new 82596 features are 
accessible in this mode, and all bits previously marked "don't care" must be zeroes. 

The Action Commands, and all other 82596 memory structures, must begin on even byte boundaries, i.e., they 
must be word aligned. 
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NOP 

This command results in no action by the 82596 except for those performed in the normal command processing.
It is used to manipulate the CBL. The format of the NOP command is shown in Figure 21.



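NOP - 82586 and 32-Bit Segmented Modes

Offset   ODD WORD (bits 31-16)              EVEN WORD (bits 15-0)
0        EL  S  I  XXXXXXXXXX  000          C  B  OK  0 0000 0000 0000
4        XXXXXXXX XXXXXXXX                  A15 ... LINK OFFSET ... A0

NOP - Linear Mode

Offset   ODD WORD (bits 31-16)              EVEN WORD (bits 15-0)
0        EL  S  I  0000000000  000          C  B  OK  0 0000 0000 0000
4        A31 ..................... LINK ADDRESS ..................... A0

Figure 21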



where:

LINK POINTER - In the 82586 or 32-bit Segmented modes this is a 16-bit offset to the next Command
Block. In the Linear mode this is the 32-bit address of the next Command Block.

EL - If set, this bit indicates that this command block is the last on the CBL.

S - If set to one, suspend the CU upon completion of this CB.

I - If set to one, the 82596 will generate an interrupt after execution of the command is
complete. If I is not set to one, the CX bit will not be set.

CMD (bits 16-18) - The NOP command. Value: 0h.

Bits 19-28 - Reserved (zero in the 32-bit Segmented and Linear modes).

C - This bit indicates the execution status of the command. The CPU initially resets it to zero
when the Command Block is placed on the CBL. Following command completion, the
82596 will set it to one.

B - This bit indicates that the 82596 is currently executing the NOP command. It is initially
reset to zero by the CPU. The 82596 sets it to one when execution begins and to zero
when execution is completed. This bit is also set when the 82596 prefetches the command.

NOTE:

The C and B bits are modified in one operation.

OK - Indicates that the command was executed without error. If set to one no error occurred
(command executed OK). If zero an error occurred.
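
The common Command Block header described for the NOP command could be modeled in C roughly as follows. This is a sketch for Linear mode with Little Endian byte ordering; the structure, macro, and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for the command/status double word of a Command Block. */
#define CB_EL      (1u << 31)                      /* last block on the CBL */
#define CB_S       (1u << 30)                      /* suspend CU after this block */
#define CB_I       (1u << 29)                      /* interrupt after completion */
#define CB_CMD(n)  (((uint32_t)(n) & 7u) << 16)    /* command code, bits 16-18 */
#define CB_C       (1u << 15)                      /* command completed */
#define CB_B       (1u << 14)                      /* command being executed */
#define CB_OK      (1u << 13)                      /* completed without error */

/* NOP Command Block in Linear mode. */
struct cb_nop_linear {
    volatile uint32_t cmd_status;  /* command word (31-16) and status word (15-0) */
    uint32_t link;                 /* 32-bit address of the next Command Block */
};

/* Prepare a NOP block; C, B and OK start at zero as the text requires. */
static void cb_nop_init(struct cb_nop_linear *cb, uint32_t next, int last)
{
    assert((next & 1u) == 0);      /* Command Blocks must be word aligned */
    cb->cmd_status = CB_CMD(0) | (last ? CB_EL : 0u);
    cb->link = next;
}

/* Nonzero when the 82596 has completed the block without error. */
static int cb_done_ok(const struct cb_nop_linear *cb)
{
    uint32_t v = cb->cmd_status;
    return (v & CB_C) && (v & CB_OK);
}
```

The same header layout applies to the other Action Commands, with a different value in the CMD field.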

Individual Address Setup 

This command is used to load the 82596 with the Individual Address. This address is used by the 82596 for
inserting the Source Address during transmission and recognizing the Destination Address during reception.
After RESET, and prior to Individual Address Setup Command execution, the 82596 assumes the Broadcast
Address is the Individual Address in all aspects, i.e.:

• This will be the Individual Address Match reference.

• This will be the Source Address of a transmitted frame (for AL-LOC = 0 mode only).
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The format of the Individual Address Setup command is shown in Figure 22. 



IA Setup - 82586 and 32-Bit Segmented Modes

Offset   ODD WORD (bits 31-16)              EVEN WORD (bits 15-0)
0        EL  S  I  XXXXXXXXXX  001          C  B  OK  A  0000 0000 0000
4        2nd byte     1st byte              A15 ... LINK OFFSET ... A0
         (INDIVIDUAL ADDRESS)
8        6th byte     5th byte              4th byte     3rd byte

IA Setup - Linear Mode

Offset   ODD WORD (bits 31-16)              EVEN WORD (bits 15-0)
0        EL  S  I  0000000000  001          C  B  OK  A  0000 0000 0000
4        A31 ..................... LINK ADDRESS ..................... A0
8        4th byte     3rd byte              2nd byte     1st byte
                                            (INDIVIDUAL ADDRESS)
12                                          6th byte     5th byte

Figure 22

where:

LINK ADDRESS, EL, B, C, I, S - As per standard Command Block (see the NOP command for details).

A - Indicates that the command was abnormally terminated due to CU Abort control
command. If one, then the command was aborted, and if necessary it should be
repeated. If this bit is zero, the command was not aborted.

Bits 19-28 - Reserved (zero in the 32-bit Segmented and Linear modes).

CMD (bits 16-18) - The Address Setup command. Value: 1h.

INDIVIDUAL ADDRESS - The individual address of the node, 0 to 6 bytes long.

The least significant bit of the Individual Address must be zero for Ethernet (see the Command Structure).
However, no enforcement of this is provided by the 82596. Thus, an Individual Address with 1 as its least
significant bit is a valid Individual Address in all aspects.

The default address length is 6 bytes, as in 802.3. If a different length is used the IA Setup command
should be executed after the Configure command.
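
An Individual Address Setup block for Linear mode might be laid out in C as sketched below. The names are hypothetical; the layout assumes Little Endian byte ordering, so that the first address byte sits at the lowest address of the double word at offset 8, and a 6-byte address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* IA Setup Command Block, Linear mode (hypothetical C view). */
struct cb_ia_linear {
    uint32_t cmd_status;  /* EL, S, I, CMD = 1h in bits 31-16; C, B, OK, A in bits 15-0 */
    uint32_t link;        /* address of the next Command Block */
    uint8_t  ia[6];       /* individual address, first byte at offset 8 */
    uint8_t  pad[2];      /* unused upper half of the double word at offset 12 */
};

/* Prepare an IA Setup block that is the last one on the CBL. */
static void cb_ia_init(struct cb_ia_linear *cb, const uint8_t mac[6], uint32_t next)
{
    assert(cb != NULL && mac != NULL);
    memset(cb, 0, sizeof *cb);
    cb->cmd_status = (1u << 31) | (1u << 16);  /* EL set, CMD = 1h */
    cb->link = next;
    memcpy(cb->ia, mac, 6);
}

/* The 82596 does not enforce a zero least significant bit, so check it in software. */
static int ia_lsb_set(const uint8_t mac[6])
{
    return mac[0] & 1u;
}
```

A driver would typically reject or warn about an address for which ia_lsb_set() returns nonzero before queuing the command.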

Configure 

The Configure command loads the 82596 with its operating parameters. It allows changing some of the
parameters by specifying a byte count less than the maximum number of configuration bytes (11 in the 82586
mode, 14 in the 32-Bit Segmented and Linear modes). The 82596 configuration depends on its mode of
operation. When configuring the 12th byte (Byte 11 undefined) in 82586 mode this byte should be all ones.

• In the 82586 mode the maximum number of configuration bytes is 12. Any number larger than 12 will be
reduced to 12 and any number less than 4 will be increased to 4.

• The additional features of the serial side are disabled in the 82586 mode.

• In both the 32-Bit Segmented and Linear modes there are four additional configuration bytes, which hold
parameters for additional 82596 features. If these parameters are not accessed, the 82596 will follow their
default values.

• For more detailed information refer to the 32-Bit LAN Components User's Manual.
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The format of the Configure command is shown in Figure 23, 24 and 25. 





Offset   ODD WORD (bits 31-16)              EVEN WORD (bits 15-0)
0        EL  S  I  XXXXXXXXXX  010          C  B  OK  A  0000 0000 0000
4        Byte 1       Byte 0                A15 ... LINK OFFSET ... A0
8        Byte 5       Byte 4                Byte 3       Byte 2
12       Byte 9       Byte 8                Byte 7       Byte 6
16       XXXXXXXX     XXXXXXXX              XXXXXXXX     Byte 10













Figure 23. CONFIGURE— 82586 Mode 





Offset   ODD WORD (bits 31-16)              EVEN WORD (bits 15-0)
0        EL  S  I  0000000000  010          C  B  OK  A  0000 0000 0000
4        Byte 1       Byte 0                A15 ... LINK OFFSET ... A0
8        Byte 5       Byte 4                Byte 3       Byte 2
12       Byte 9       Byte 8                Byte 7       Byte 6
16       Byte 13      Byte 12               Byte 11      Byte 10





















Figure 24. CONFIGURE— 32-Bit Segmented Mode 





Offset   ODD WORD (bits 31-16)              EVEN WORD (bits 15-0)
0        EL  S  I  0000000000  010          C  B  OK  A  0000 0000 0000
4        A31 ..................... LINK ADDRESS ..................... A0
8        Byte 3       Byte 2                Byte 1       Byte 0
12       Byte 7       Byte 6                Byte 5       Byte 4
16       Byte 11      Byte 10               Byte 9       Byte 8
20       XXXXXXXX     XXXXXXXX              Byte 13      Byte 12













Figure 25. CONFIGURE— Linear Mode 

LINK ADDRESS, EL, B, C, I, S - As per standard Command Block (see the NOP command for details).

A - Indicates that the command was abnormally terminated due to a CU Abort control command.
If 1, then the command was aborted and if necessary it should be repeated. If this
bit is 0, the command was not aborted.

Bits 19-28 - Reserved (zero in the 32-Bit Segmented and Linear Modes).

CMD (bits 16-18) - The CONFIGURE command. Value: 2h.



The interpretation of the fields follows: 

BYTE 0

Bit:      7     6     5     4     3     2     1     0
          P     X     X     X     BYTE COUNT

BYTE CNT (Bits 0-3)    Byte Count. Number of bytes, including this one, that hold
                       parameters to be configured.

PREFETCHED (Bit 7)     Enable the 82596 to write the prefetched bit in all prefetch
                       RBDs.
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NOTE: 

The P bit is valid only in the new memory structure modes. In 82586 mode this bit is disabled (i.e., no 
prefetched mark). 



BYTE 1
DEFAULT: C8h

Bit:      7     6     5     4     3     2     1     0
          MONITOR     X     X     FIFO LIMIT

FIFO Limit (Bits 0-3)  FIFO limit.

MONITOR (Bits 6-7)     Receive monitor options. If the Byte Count of the configure
                       command is less than 12 bytes then these Monitor bits are ignored.



BYTE 2
DEFAULT: 40h

Bit:      7       6     5     4     3     2     1           0
          SAVBF   1     0     0     0     0     RESUME_RD   0

SAV BF (Bit 7)         0 - Received bad frames are not saved in the memory.
                       1 - Received bad frames are saved in the memory.

RESUME_RD (Bit 1)      0 - The 82596 does not reread the next CB on the list when a CU Resume
                           Control Command is issued.
                       1 - The 82596 will reread the next CB on the list when a CU Resume
                           Control Command is issued. This is available only on the 82596 B
                           stepping.



BYTE 3
DEFAULT: 26h

Bit:      7     6       5     4       3             2     1     0
          LOOP BACK     PREAMBLE      NO SRC        ADDRESS LENGTH
          MODE          LENGTH        ADD INS

ADR LEN (Bits 0-2)           Address length (any kind).

NO SRC ADD INS (Bit 3)       No Source Address Insertion.
                             In the 82586 this bit is called AL LOC.

PREAM LEN (Bits 4-5)         Preamble length.

LP BCK MODE (Bits 6-7)       Loopback mode.



BYTE 4
DEFAULT: 00h

Bit:      7          6     5     4            3     2     1     0
          BOF METD   EXPONENTIAL PRIORITY     0     LINEAR PRIORITY

LIN PRIO (Bits 0-2)    Linear Priority.

EXP PRIO (Bits 4-6)    Exponential Priority.

BOF METD (Bit 7)       Exponential Backoff method.



BYTE 5
DEFAULT: 60h

INTERFRAME SPACING (Bits 0-7)    Interframe spacing.
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SLOT TIME -LOW 



BYTE 6
DEFAULT: 00h

SLOT TIME (L) (Bits 0-7)    Slot time, low byte.



BYTE 7
DEFAULT: F2h

Bit:      7     6     5     4         3     2     1     0
          MAXIMUM RETRY NUMBER        0     SLOT TIME - HIGH

SLOT TIME (H) (Bits 0-2)    Slot time, high part.

RETRY NUM (Bits 4-7)        Number of transmission retries on collision.



PAD 


BIT 
STUFF 


CRC16/ 
CRC32 


NOCRC 
INSER 


TONO 
CRS 


MAN/ 
; NRZ 


BC 
DIS 


PRM 
MODE 



BYTE 8 
PRM (Bit 0) 
BCDIS(Bitl) 
MANCH/NRZ(Bit2) 

TONO CRS (Bit 3) 
NOCRC INS (Bit 4) 
CRC-16/CRC-32(Bit5) 
BIT STF (Bit6) 
PAD (Bit 7) 
DEFAULT: OOh 



Promiscuous mode. 

Broadcast disable. 

Manchester or NRZ encoding. See specific timing require- 
ments for TXC in Manchester mode. 

Transmit on no CRS. 

No CRC insertion. 

CRCtype. 

Bit stuffing. 

Padding. 



CARRIER SENSE FILTER 



CDT SRC 



COLLISION DETECT FILTER 



CRS SRC 



BYTE 9 
CRSF (Bits 0-2) 
CRS SRC (Bit 3) 
CDTF (Bits 4-6) 
CDT SRC (Bit 7) 
DEFAULT: OOh 



Carrier Sense filter (length). 
Carrier Sense source. 
Collision Detect filter (length). 
Collision Detect source. 
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MINIMUM FRAME LENGTH 



BYTE 10 
MIN FRAME LEN 
DEFAULT: 40h 



BYTE 1 1 
PRECRS (Bit 0) 
LNGFLD(Bltl) 
CRCINM(Bit2) 
AUTOTX (Bit 3) 
CDBSAC (Bit 4) 
MC_ALL (Bit 5) 
MONITOR (Bits 6-7) 
DEFAULT: FFH 



BYTE 12 
FDX (Bit 6) 
DEFAULT: OOh 



Minimum frame length. 



7 















MONITOR 


MC__ALL 


CDBSAC 


AUTOTX 


CRCINM 


LNGFLD 


PRECRS 



BYTE 13 
MULT_JA (Bit 6) 
DIS__BOF (Bit 7) 
DEFAULT: 3Fh 



Preamble until Carrier Sense 

Length field. Enables padding at the End-of-Carrier framing (802.3). 

Rx CRC appended to the frame in memory. 

Auto retransmit when a collision occurs during the preamble. 

Collision Detect by source address recognition. 

Enable to receive all MC frames. 

Receive monitor options. 



7 
























FDX 





















Enables Full Duplex operation. 




7 



















DIS_BOF 


MULT_IA 


1 


1 


1 


1 


1 


1 



Multiple individual address. 
Disable the backoff algorithm. 
A reset (hardware or software) configures the 82596 according to the following defaults.

Table 4. Configuration Defaults

Parameter                      Default Value   Units/Meaning
-----------------------------  -------------   ----------------------------------
ADDRESS LENGTH                 **6             Bytes
A/L FIELD LOCATION             0               Located in FD
* AUTO RETRANSMIT              1               Auto Retransmit Enable
BITSTUFFING/EOC                0               EOC
BROADCAST DISABLE              0               Broadcast Reception Enabled
* CDBSAC                       1               Disabled
CDT FILTER                     0               Bit Times
CDT SRC                        0               External Collision Detection
* CRC IN MEMORY                1               CRC Not Transferred to Memory
CRC-16/CRC-32                  **0             CRC-32
CRS FILTER                     0               Bit Times
CRS SRC                        0               External CRS
* DISBOF                       0               Backoff Enabled
EXT LOOPBACK                   0               Disabled
EXPONENTIAL PRIORITY           **0             802.3 Algorithm
EXPONENTIAL BACKOFF METHOD     **0             802.3 Algorithm
* FULL DUPLEX (FDX)            0               CSMA/CD Protocol (No FDX)
FIFO THRESHOLD                 8               TX: 32 Bytes, RX: 64 Bytes
INT LOOPBACK                   0               Disabled
INTERFRAME SPACING             **96            Bit Times
LINEAR PRIORITY                **0             802.3 Algorithm
* LENGTH FIELD                 1               Padding Disabled
MIN FRAME LENGTH               **64            Bytes
* MC ALL                       1               Disabled
* MONITOR                      11              Disabled
MANCHESTER/NRZ                 0               NRZ
* MULTI IA                     0               Disabled
NUMBER OF RETRIES              **15            Maximum Number of Retries
NO CRC INSERTION               0               CRC Appended to Frame
PREFETCH BIT IN RBD            0               Disabled (Valid Only in New Modes)
PREAMBLE LENGTH                **7             Bytes
* Preamble Until CRS           1               Disabled
PROMISCUOUS MODE               0               Address Filter On
PADDING                        0               No Padding
SLOT TIME                      **512           Bit Times
SAVE BAD FRAME                 0               Discards Bad Frames
TRANSMIT ON NO CRS             0               Disabled

NOTES:

1. This configuration setup is compatible with the IEEE 802.3 specification.

2. The Asterisk "*" signifies a new configuration parameter not available in the 82586.

3. The default value of the Auto retransmit configuration parameter is enabled (1).

4. Double Asterisk "**" signifies IEEE 802.3 requirements.
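
As a cross-check of the byte defaults against Table 4, the following Python sketch decodes a few numeric fields from the default configure bytes. The helper names (`field`, `decode_config`) and the dictionary layout are illustrative assumptions, not part of any Intel software; only the bit positions and default values come from the byte descriptions above.

```python
# Default values of configure bytes 1-13, taken from the byte descriptions.
DEFAULT_CONFIG = {
    1: 0xC8, 2: 0x40, 3: 0x26, 4: 0x00, 5: 0x60, 6: 0x00, 7: 0xF2,
    8: 0x00, 9: 0x00, 10: 0x40, 11: 0xFF, 12: 0x00, 13: 0x3F,
}


def field(value, lo, hi):
    """Return bits lo..hi (inclusive) of a byte as an integer."""
    width = hi - lo + 1
    return (value >> lo) & ((1 << width) - 1)


def decode_config(cfg):
    """Decode a few numeric parameters (hypothetical helper for illustration)."""
    return {
        "fifo_limit": field(cfg[1], 0, 3),            # BYTE 1, bits 0-3
        "address_length": field(cfg[3], 0, 2),        # BYTE 3, bits 0-2
        "preamble_length_code": field(cfg[3], 4, 5),  # BYTE 3, bits 4-5 (raw code)
        "interframe_spacing": cfg[5],                 # BYTE 5
        # Slot time: low byte in BYTE 6, high part in BYTE 7 bits 0-2.
        "slot_time": (field(cfg[7], 0, 2) << 8) | cfg[6],
        "max_retries": field(cfg[7], 4, 7),           # BYTE 7, bits 4-7
        "min_frame_length": cfg[10],                  # BYTE 10
        "full_duplex": field(cfg[12], 6, 6),          # BYTE 12, bit 6
    }


if __name__ == "__main__":
    for name, value in decode_config(DEFAULT_CONFIG).items():
        print(f"{name}: {value}")
```

With the defaults, the decoded address length (6), interframe spacing (96), slot time (512), retry count (15) and minimum frame length (64) agree with the corresponding rows of Table 4.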
Multicast-Setup

This command is used to load the 82596 with the Multicast-IDs that should be accepted. As noted previously,
the filtering done on the Multicast-IDs is not perfect and some unwanted frames may be accepted. This
command resets the current filter and reloads it with the specified Multicast-IDs. The format of the
Multicast-addresses setup command is:

  Odd word (bits 31-16)                Even word (bits 15-0)
  EL, S, I, X..., CMD                  C, B, OK, A, zeros
  X, X, MC COUNT                       LINK OFFSET (A15-A0)
  MULTICAST ADDRESSES LIST (1st byte ... Nth byte)

Figure 26. MC Setup - 82586 and 32-Bit Segmented Modes

  Odd word (bits 31-16)                Even word (bits 15-0)
  EL, S, I, zeros, CMD                 C, B, OK, A, zeros
  LINK ADDRESS (A31-A0)
  MC list, 2nd byte, 1st byte          X, X, MC COUNT
  MULTICAST ADDRESSES LIST (... Nth byte)

Figure 27. MC Setup - Linear Mode

where:

LINK ADDRESS,
EL, B, C, I, S     - As per standard Command Block (see the NOP command for details).

A                  - Indicates that the command was abnormally terminated due to a CU Abort control
                     command. If one, then the command was aborted and if necessary it should be
                     repeated. If this bit is zero, the command was not aborted.

Bits 19-28         - Reserved (0 in both the 32-Bit Segmented and Linear Modes).

CMD (bits 16-18)   - The MC SETUP command value: 3h.

MC-CNT             - This 14-bit field indicates the number of bytes in the MC LIST field. The MC CNT
                     must be a multiple of the ADDR LEN; otherwise, the 82596 reduces the MC CNT to
                     the nearest ADDR LEN multiple. MC CNT = 0 implies resetting the Hash table,
                     which is equivalent to disabling the Multicast filtering mechanism.

MC LIST            - A list of Multicast Addresses to be accepted by the 82596. The least significant bit
                     of each MC address must be 1.

NOTE:

The list is sequential; i.e., the most significant byte of an address is immediately followed by the least
significant byte of the next address.

                   - When the 82596 is configured to recognize multiple Individual Address (Multi-IA),
                     the MC-Setup command is also used to set up the Hash table for the individual
                     address.
                     The least significant bit in the first byte of each IA address must be 0.
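
The MC CNT and MC LIST rules can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, "reduces ... to the nearest ADDR LEN multiple" is assumed to mean rounding down, and the multicast bit is assumed to be the least significant bit of the first byte of each address.

```python
def effective_mc_count(mc_cnt, addr_len):
    """Byte count the device would act on, assuming it rounds MC CNT down
    to a whole number of addresses."""
    if not 0 <= mc_cnt < (1 << 14):
        raise ValueError("MC CNT is a 14-bit field")
    if addr_len <= 0:
        raise ValueError("address length must be positive")
    return mc_cnt - (mc_cnt % addr_len)


def build_mc_list(addresses, addr_len=6):
    """Concatenate multicast addresses into a sequential MC LIST.

    Each address is a bytes object; the first byte is assumed to carry
    the multicast (least significant) bit.
    """
    out = bytearray()
    for addr in addresses:
        if len(addr) != addr_len:
            raise ValueError("address does not match the configured length")
        if addr[0] & 1 == 0:
            raise ValueError("multicast address must have its LSB set to 1")
        out += addr
    return bytes(out)


if __name__ == "__main__":
    mc_list = build_mc_list([bytes.fromhex("01005e000001"),
                             bytes.fromhex("01005e0000fb")])
    print(len(mc_list), effective_mc_count(len(mc_list), 6))
```

An MC CNT of 0 (an empty list) corresponds to the case the text describes as resetting the Hash table.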
Transmit

This command is used to transmit a frame of user data onto the serial link. The format of a Transmit command
is as follows.

  Odd word (bits 31-16)                Even word (bits 15-0)
  EL, S, I, X..., CMD                  C, B, STATUS BITS, MAXCOLL
  TBD OFFSET (A15-A0)                  LINK OFFSET (A15-A0)
  DESTINATION ADDRESS (4th byte ... 1st byte)
  LENGTH FIELD                         DESTINATION ADDRESS (... 6th byte)

Figure 28. TRANSMIT - 82586 Mode

  Odd word (bits 31-16)                Even word (bits 15-0)
  EL, S, I, zeros, NC, SF, CMD         C, B, STATUS BITS, MAXCOLL
  TBD OFFSET (A15-A0)                  LINK OFFSET (A15-A0)
  zeros                                EOF, TCB COUNT
  DESTINATION ADDRESS (4th byte ... 1st byte)
  LENGTH FIELD                         DESTINATION ADDRESS (... 6th byte)
  OPTIONAL DATA

Figure 29. TRANSMIT - 32-Bit Segmented Mode

  Odd word (bits 31-16)                Even word (bits 15-0)
  EL, S, I, zeros, NC, SF, CMD         C, B, STATUS BITS, MAXCOLL
  LINK ADDRESS (A31-A0)
  TRANSMIT BUFFER DESCRIPTOR ADDRESS (A31-A0)
  zeros                                EOF, TCB COUNT
  DESTINATION ADDRESS (4th byte ... 1st byte)
  LENGTH FIELD                         DESTINATION ADDRESS (... 6th byte)
  OPTIONAL DATA

Figure 30. TRANSMIT - Linear Mode

COMMAND WORD (bits 31-16): EL, S, I, zeros, NC, SF, CMD

NC   0: No CRC Insertion disable; when the configure command is configured to not insert the CRC
        during transmission the NC bit has no effect.
     1: No CRC Insertion enable; when the configure command is configured to insert the CRC during
        transmission the CRC will not be inserted when NC = 1.

SF   0: Simplified Mode, all the Tx data is in the Transmit Command Block. The Transmit Buffer
        Descriptor Address field is all 1s.
     1: Flexible Mode. Data is in the TCB and in a linked list of TBDs.

where:

EL, B, C, I, S     - As per standard Command Block (see the NOP command for details).

OK (Bit 13)        - Error free completion.

A (Bit 12)         - Indicates that the command was abnormally terminated due to CU Abort control
                     command. If 1, then the command was aborted, and if necessary it should be
                     repeated. If this bit is 0, the command was not aborted.

Bits 19-28         - Reserved (0 in the 32-bit Segmented and Linear modes).

CMD (Bits 16-18)   - The transmit command: 4h.

Status Bit 11      - Late collision. A late collision (a collision after the slot time is elapsed) is detected.

Status Bit 10      - No Carrier Sense signal during transmission. Carrier Sense signal is monitored
                     from the end of Preamble transmission until the end of the Frame Check Sequence.
                     For TONOCRS = 1 (Transmit On No Carrier Sense mode) it indicates that
                     transmission has been executed despite a lack of CRS. For TONOCRS = 0 (Ethernet
                     mode), this bit also indicates unsuccessful transmission (transmission stopped
                     when lack of Carrier Sense has been detected).

Status Bit 9       - Transmission unsuccessful (stopped) due to Loss of CTS.

Status Bit 8       - Transmission unsuccessful (stopped) due to DMA Underrun; i.e., the system did
                     not supply data for transmission.

Status Bit 7       - Transmission Deferred, i.e., transmission was not immediate due to previous link
                     activity.

Status Bit 6       - Heartbeat Indicator. Indicates that after a previously performed transmission, and
                     before the most recently performed transmission (Interframe Spacing), the CDT
                     signal was monitored as active. This indicates that the Ethernet Transceiver
                     Collision Detect logic is performing properly. The Heartbeat is monitored during the
                     Interframe Spacing period.

Status Bit 5       - Transmission attempt was stopped because the number of collisions exceeded the
                     maximum allowable number of retries.

Status Bit 4       - (Reserved).

MAX-COL            - The number of Collisions experienced during this frame. Max Col = 0 plus S5 = 1
(Bits 3-0)           indicates 16 collisions.

LINK OFFSET        - As per standard Command Block (see the NOP Command for details).

TBD POINTER        - In the 82586 and 32-bit Segmented modes this is the offset of the first Tx Buffer
                     Descriptor containing the data to be transmitted. In the Linear mode this is the
                     32-bit address of the first Tx Buffer Descriptor on the list. If the TBD POINTER is all 1s
                     it indicates that no TBD is used.

DEST ADDRESS       - Contains the Destination Address of the frame. The least significant bit (MC)
                     indicates the address type.

                     MC = 0: Individual Address.

                     MC = 1: Multicast or Broadcast Address.

                     If the Destination Address bits are all 1s this is a Broadcast Address.

LENGTH FIELD       - The contents of this 2-byte field are user defined. In 802.3 it contains the length of
                     the data field. It is placed in memory in the same order it is transmitted; i.e., most
                     significant byte first, least significant byte second.

TCB COUNT          - This 14-bit counter indicates the number of bytes that will be transmitted from the
                     Transmit Command Block, starting from the third byte after the TCB COUNT field
                     (address N+12 in the 32-bit Segmented mode, N+16 in the Linear mode). The
                     TCB COUNT field can be any number of bytes (including an odd byte); this allows
                     the user to transmit a frame with a header having an odd number of bytes. The
                     TCB COUNT field is not used in the 82586 mode.

EOF Bit            - Indicates that the whole frame is kept in the Transmit Command Block. In the
                     Simplified memory model it must be always asserted.
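
The status bits listed above lend themselves to a small decoder. The Python sketch below is one possible reading of the Transmit status word (the even word of the first command block entry); the function and key names are assumptions made for illustration, and only the bit positions and the "Max Col = 0 with status bit 5 set means 16 collisions" rule come from the text.

```python
# Names for status bits 11-5 of the Transmit command status word.
TX_STATUS_BITS = {
    11: "late_collision",
    10: "no_carrier_sense",
    9: "lost_cts",
    8: "dma_underrun",
    7: "deferred",
    6: "heartbeat",
    5: "max_collisions_exceeded",
}


def decode_tx_status(status):
    """Decode a 16-bit Transmit status word into a dictionary (illustrative)."""
    result = {
        "ok": bool(status & (1 << 13)),       # OK, bit 13
        "aborted": bool(status & (1 << 12)),  # A, bit 12
    }
    for bit, name in TX_STATUS_BITS.items():
        result[name] = bool(status & (1 << bit))
    max_col = status & 0x000F                 # MAX-COL, bits 3-0
    # Max Col = 0 together with status bit 5 set indicates 16 collisions.
    if max_col == 0 and result["max_collisions_exceeded"]:
        result["collisions"] = 16
    else:
        result["collisions"] = max_col
    return result


if __name__ == "__main__":
    print(decode_tx_status(0x2003))  # completed without error after 3 collisions
    print(decode_tx_status(0x0020))  # stopped: retry limit exceeded
```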
The interpretation of what is transmitted depends on the No Source Address insertion configuration bit and the
memory model being used.

NOTES:

1. The Destination Address and the Length Field are sequential. The Length Field immediately follows the
most significant byte of the Destination Address.

2. If the 82596 is configured with the No Source Address insertion bit equal to 0, the 82596 inserts its
configured Source Address in the transmitted frame.

• In the 82586 mode, or when the Simplified memory model is used, the Destination and Length fields of the
transmitted frame are taken from the Transmit Command Block.

• If the Flexible memory model is used, the Destination and Length fields of the transmitted frame can be
found either in the TCB or TBD, depending on the TCB COUNT.

3. If the 82596 is configured with the Address/Length Field Location equal to 1, the 82596 does not insert its
configured Source Address in the transmitted frame. The first (2 x Address Length) + 2 bytes of the
transmitted frame are interpreted as Destination Address, Source Address, and Length fields respectively.
The location of the first transmitted byte depends on the operational mode of the 82596:

• In the 82586 mode, it is always the first byte of the first Tx Buffer.

• In both the 32-bit Segmented and Linear modes it depends on the SF bit and TCB COUNT:

  - In the Simplified memory mode the first transmitted byte is always the third byte after the TCB COUNT
    field.

  - In the Flexible mode, if the TCB COUNT is greater than 0 then it is the third byte after the TCB COUNT
    field. If TCB COUNT equals 0 then it is the first byte of the first Tx Buffer.

• Transmit frames shorter than six bytes are invalid. The transmission will be aborted (only in 82586 mode)
because of a DMA Underrun.

4. Frames which are aborted during transmission are jammed. Such an interruption of transmission can be
caused by any reason indicated by any of the status bits 8, 9, 10 and 12.
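
The rules in note 3 for locating the first transmitted byte can be written out as a small decision function. The Python sketch below is illustrative only: the mode strings, return strings and function names are hypothetical, and it covers just the case where the 82596 does not insert its own Source Address.

```python
MODE_82586 = "82586"
MODE_SEGMENTED = "32-bit segmented"
MODE_LINEAR = "linear"

IN_TCB = "third byte after the TCB COUNT field"
IN_FIRST_BUFFER = "first byte of the first Tx Buffer"


def header_length(address_length):
    """Bytes interpreted as Destination Address, Source Address and Length."""
    return 2 * address_length + 2


def first_transmitted_byte(mode, sf=0, tcb_count=0):
    """Return where the first transmitted byte is fetched from.

    sf is the SF bit (0 = Simplified, 1 = Flexible); tcb_count is the
    TCB COUNT field. Both are ignored in 82586 mode.
    """
    if mode == MODE_82586:
        return IN_FIRST_BUFFER
    if mode not in (MODE_SEGMENTED, MODE_LINEAR):
        raise ValueError("unknown mode: " + repr(mode))
    if sf == 0:            # Simplified memory model
        return IN_TCB
    if tcb_count > 0:      # Flexible model with data in the TCB
        return IN_TCB
    return IN_FIRST_BUFFER  # Flexible model, TCB COUNT equals 0


if __name__ == "__main__":
    print(header_length(6))
    print(first_transmitted_byte(MODE_LINEAR, sf=1, tcb_count=0))
```

For a 6-byte address length, `header_length` gives 14 bytes, which matches the familiar destination, source and length header size.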

Jamming Rules 

1. Jamming will not start before completion of preamble transmission. 

2. Collisions detected during transmission of the last 11 bits will not result in jamming.

The format of a Transmit Buffer Descriptor is:

82586 Mode

  Offset  Odd word (bits 31-16)        Even word (bits 15-0)
  0       NEXT TBD OFFSET              EOF (bit 15), X, SIZE (ACT COUNT) (bits 13-0)
  4       X..., TRANSMIT BUFFER ADDRESS

32-Bit Segmented Mode

  Offset  Odd word (bits 31-16)        Even word (bits 15-0)
  0       NEXT TBD OFFSET              EOF (bit 15), 0, SIZE (ACT COUNT) (bits 13-0)
  4       TRANSMIT BUFFER ADDRESS

Linear Mode

  Offset  Odd word (bits 31-16)        Even word (bits 15-0)
  0       zeros                        EOF (bit 15), 0, SIZE (ACT COUNT) (bits 13-0)
  4       NEXT TBD ADDRESS
  8       TRANSMIT BUFFER ADDRESS

Figure 31

where:

EOF                - This bit indicates that this TBD is the last one associated with the frame being
                     transmitted. It is set by the CPU before transmit.

SIZE (ACT COUNT)   - This 14-bit quantity specifies the number of bytes that hold information for the
                     current buffer. It is set by the CPU before transmission.

NEXT TBD ADDRESS   - In the 82586 and 32-bit Segmented modes, it is the offset of the next TBD on the
                     list. In the Linear mode this is the 32-bit address of the next TBD on the list. It is
                     meaningless if EOF = 1.

BUFFER ADDRESS     - The starting address of the memory area that contains the data to be sent. In the
                     82586 mode, this is a 24-bit address (A31-A24 are considered to be zero). In the
                     32-bit Segmented and Linear modes this is a 32-bit address. This buffer can be
                     byte aligned for the 82596 B step.
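
As an illustration of the Linear mode layout in Figure 31, the Python sketch below packs one Transmit Buffer Descriptor into 12 bytes. It rests on several assumptions that the text does not state outright: little-endian byte ordering in memory, EOF in bit 15 of the first doubleword (read from the bit ruler in the figure), and a hypothetical helper name `pack_linear_tbd`.

```python
import struct


def pack_linear_tbd(size, next_tbd_addr, buffer_addr, eof):
    """Pack a Linear mode TBD: [EOF | SIZE], NEXT TBD ADDRESS, BUFFER ADDRESS.

    size          number of valid bytes in the buffer (14-bit field)
    next_tbd_addr 32-bit address of the next TBD (meaningless when eof is set)
    buffer_addr   32-bit address of the transmit buffer
    eof           True for the last TBD of the frame
    """
    if not 0 <= size < (1 << 14):
        raise ValueError("SIZE (ACT COUNT) is a 14-bit quantity")
    word0 = size | (0x8000 if eof else 0)  # odd word (bits 31-16) left as zeros
    # "<III" = three little-endian unsigned 32-bit values (assumed byte order).
    return struct.pack("<III", word0,
                       next_tbd_addr & 0xFFFFFFFF,
                       buffer_addr & 0xFFFFFFFF)


if __name__ == "__main__":
    tbd = pack_linear_tbd(size=100, next_tbd_addr=0xFFFFFFFF,
                          buffer_addr=0x00200000, eof=True)
    print(tbd.hex())
```

A chain of descriptors would be built by pointing each `next_tbd_addr` at the following TBD and setting `eof` only on the last one.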



TDR 

This operation activates Time Domain Reflectometry, which is a mechanism to detect open or short circuits on
the link and their distance from the diagnosing station. The TDR command has no parameters. The TDR
transmit sequence was changed, compared to the 82586, to form a regular transmission. The TDR bit stream
is as follows.

- Preamble

- Source address

- Another Source address (the TDR frame is transmitted back to the sending station,
  so DEST ADR = SRC ADR).

- Data field containing 7Eh patterns.

- Jam Pattern, which is the inverse CRC of the transmitted frame.

Maximum length of the TDR frame is 2048 bits. If the 82596 senses collision while transmitting the TDR frame
it transmits the jam pattern and stops the transmission. The 82596 then triggers an internal timer (STC); the
timer is reset at the beginning of transmission and reset if CRS is returned. The timer measures the time
elapsed from the start of transmission until an echo is returned. The echo is indicated by Collision Detect going
active or a drop in the Carrier Sense signal. The following table lists the possible cases that the 82596 is able
to analyze.

Conditions of TDR as Interpreted by the 82596

                                                Transceiver Type
Condition                                       Ethernet                       Non Ethernet
----------------------------------------------  -----------------------------  -----------------------
Carrier Sense was inactive for 2048-bit-time    Short or Open on the           NA
periods                                         Transceiver Cable
Carrier Sense signal dropped                    Short on the Ethernet cable    NA
Collision Detect went active                    Open on the Ethernet cable     Open on the Serial Link
The Carrier Sense Signal did not drop or the    No Problem                     No Problem
Collision Detect did not go active within
2048-bit time period

An Ethernet transceiver is defined as one that returns transmitted data on the receive pair and activates the
Carrier Sense Signal while transmitting. A Non-Ethernet Transceiver is defined as one that does not do so.
The format of the Time Domain Reflectometer command is:

82586 and 32-Bit Segmented Modes

  Odd word (bits 31-16)                              Even word (bits 15-0)
  EL, S, I, X..., CMD                                C, B, OK, zeros
  LNK OK, XVR PRB, ET OPN, ET SRT, X, TIME (11 bits) LINK OFFSET (A15-A0)

Linear Mode

  Odd word (bits 31-16)                Even word (bits 15-0)
  EL, S, I, zeros, CMD                 C, B, OK, zeros
  LINK ADDRESS (A31-A0)
  zeros                                LNK OK, XVR PRB, ET OPN, ET SRT, X, TIME (11 bits)

Figure 32. TDR

where:

LINK ADDRESS,
EL, B, C, I, S     - As per standard Command Block (see the NOP command for details).

A                  - Indicates that the command was abnormally terminated due to CU Abort control
                     command. If one, then the command was aborted, and if necessary it should be
                     repeated. If this bit is zero, the command was not aborted.

Bits 19-28         - Reserved (0 in the 32-bit Segmented and Linear Modes).

CMD (Bits 16-18)   - The TDR command. Value: 5h.

TIME               - An 11-bit field that specifies the number of TxC cycles that elapsed before an echo
                     was observed. No echo is indicated by a reception consisting of "1s" only.
                     Because the network contains various elements such as transceiver links,
                     transceivers, Ethernet, repeaters etc., the TIME is not exactly proportional to the
                     problem's distance.

LNK OK (Bit 15)    - No link problem identified. TIME = 7FFh.

XCVR PRB (Bit 14)  - Indicates a Transceiver problem. Carrier Sense was inactive for 2048-bit time
                     period. LNK OK = 0. TIME = 7FFh.

ET OPN (Bit 13)    - The transmission line is not properly terminated. Collision Detect went active and
                     LNK OK = 0.

ET SRT (Bit 12)    - There is a short circuit on the transmission line. Carrier Sense Signal dropped and
                     LNK OK = 0.
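
A driver would typically unpack the TDR result word into its flags and the TIME field. The Python sketch below shows one way to do that under the bit assignments listed above; the function name and dictionary keys are hypothetical, and no attempt is made to convert TIME into a distance because the text notes that the relationship is not exactly proportional.

```python
def decode_tdr_result(word):
    """Split the 16-bit TDR result word into its flags and TIME field."""
    return {
        "link_ok": bool(word & (1 << 15)),              # LNK OK
        "transceiver_problem": bool(word & (1 << 14)),  # XCVR PRB
        "open": bool(word & (1 << 13)),                 # ET OPN
        "short": bool(word & (1 << 12)),                # ET SRT
        "time": word & 0x07FF,                          # TIME, 11 bits, TxC cycles
    }


if __name__ == "__main__":
    print(decode_tdr_result(0x87FF))       # no problem found, TIME = 7FFh
    print(decode_tdr_result(0x1000 | 25))  # short reported after 25 TxC cycles
```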
DUMP

This command causes the contents of various 82596 registers to be placed in a memory area specified by the
user. It is supplied as an 82596 self-diagnostic tool, and to provide registers of interest to the user. The format
of the DUMP command is:

82586 and 32-Bit Segmented Modes

  Odd word (bits 31-16)                Even word (bits 15-0)
  EL, S, I, X..., CMD                  C, B, OK, zeros
  BUFFER OFFSET (A15-A0)               LINK OFFSET (A15-A0)

Linear Mode

  Odd word (bits 31-16)                Even word (bits 15-0)
  EL, S, I, X..., CMD                  C, B, OK, zeros
  LINK ADDRESS (A31-A0)
  BUFFER ADDRESS (A31-A0)

Figure 33. Dump

where:

LINK ADDRESS,
EL, B, C, I, S     - As per standard Command Block (see the NOP command for details).

OK                 - Indicates error free completion.

Bits 19-28         - Reserved (0 in the 32-bit Segmented and Linear Modes).

CMD (Bits 16-18)   - The Dump command. Value: 6h.

BUFFER POINTER     - In the 82586 and 32-bit Segmented modes this is the 16-bit offset portion of the
                     dump area address. In the Linear mode this is the 32-bit linear address of the dump
                     area.

Dump Area Information Format

• The 82596 is not Dump compatible with the 82586 because of the 32-bit internal architecture. In 82586
mode the 82596 will dump the same number of bytes as the 82586. The compatible data will be marked
with an asterisk.

• In 82586 mode the dump area is 170 bytes.

• The DUMP area format of the 32-bit Segmented and Linear modes is described in Figure 35.

• The size of the dump area of the 32-bit Segmented and Linear modes is 304 bytes.

• When the Dump is executed by the Port command an extra word will be appended to the Dump Area. The
extra word is a copy of the Dump Area status word (containing the C, B, and OK Bits). The C and OK Bits
are set when the 82596 has completed the Port Dump command.
The figure is a 16-bit-wide map (bits 15-0). The hexadecimal offsets marked on it are 00 through 30,
6A through 6E, 7A, 7C, and 82 through A8. The fields appear in the following order:

DMA CONTROL REGISTER
CONFIGURE BYTES* 3, 2
CONFIGURE BYTES* 5, 4
CONFIGURE BYTES* 7, 6
CONFIGURE BYTES* 9, 8
CONFIGURE BYTES* 10
I.A. BYTES 1, 0*
I.A. BYTES 3, 2*
I.A. BYTES 5, 4*
LAST T.X. STATUS*
T.X. CRC BYTES 1, 0*
T.X. CRC BYTES 3, 2*
R.X. CRC BYTES 1, 0*
R.X. CRC BYTES 3, 2*
R.X. TEMP MEMORY 1, 0*
R.X. TEMP MEMORY 3, 2*
R.X. TEMP MEMORY 5, 4*
LAST RECEIVED STATUS*
HASH REGISTER BYTES 1, 0*
HASH REGISTER BYTES 3, 2*
HASH REGISTER BYTES 5, 4*
HASH REGISTER BYTES 7, 6*
SLOT TIME COUNTER*
WAIT TIME COUNTER*
MICRO MACHINE** REGISTER FILE (60 BYTES)
MICRO MACHINE LFSR**
MICRO MACHINE** FLAG ARRAY (14 BYTES)
QUEUE MEMORY** CUPORT (8 BYTES)
MICRO MACHINE ALU**
RESERVED**
M.M. TEMP A ROTATE R**
M.M. TEMP A**
T.X. DMA BYTE COUNT**
M.M. INPUT PORT ADDRESS**
T.X. DMA ADDRESS
M.M. OUTPUT PORT**
R.X. DMA BYTE COUNT**
M.M. OUTPUT PORT ADDRESS REGISTER**
R.X. DMA ADDRESS**
RESERVED**
BUS THROTTLE TIMERS
DIU CONTROL REGISTER**
RESERVED**
DMA CONTROL REGISTER**
BIU CONTROL REGISTER**
M.M. DISPATCHER REG.**
M.M. STATUS REGISTER**

* The 82596 is not Dump compatible with the 82586 because of the 32-bit internal architecture. In 82586
mode the 82596 will dump the same number of bytes as the 82586.

** These bytes are not user defined; results may vary from Dump command to Dump command.

Figure 34. Dump Area Format - 82586 Mode
The figure is a 32-bit-wide map (bits 31-0). The hexadecimal offsets marked on it are 00 through 34,
B0 through B8, D0, D4, and E0 through 12C. The fields appear in the following order:

CONFIGURE BYTES 5, 4, 3, 2
CONFIGURE BYTES 9, 8, 7, 6
CONFIGURE BYTES 13, 12, 11, 10
I.A. BYTES 1, 0 | XXXXXXXX
I.A. BYTES 5, 2
TX CRC BYTES 0, 1
LAST T.X. STATUS
RX CRC BYTES 0, 1
TX CRC BYTES 3, 2
R.X. TEMP MEMORY 1, 0
RX CRC BYTES 3, 2
R.X. TEMP MEMORY 5, 2
HASH REGISTERS 1, 0 | LAST R.X. STATUS
HASH REGISTER BYTES 5, 2
SLOT TIME COUNTER
HASH REGISTERS 7, 6
RECEIVE FRAME LENGTH
WAIT-TIME COUNTER
MICRO MACHINE** REGISTER FILE (128 BYTES)
MICRO MACHINE LFSR**
MICRO MACHINE** FLAG ARRAY (28 BYTES)
M.M. INPUT PORT** (16 BYTES)
MICRO MACHINE ALU**
RESERVED**
M.M. TEMP A ROTATE R.**
M.M. TEMP A**
T.X. DMA BYTE COUNT**
M.M. INPUT PORT ADDRESS REGISTER**
T.X. DMA ADDRESS**
M.M. OUTPUT PORT REGISTER**
R.X. DMA BYTE COUNT**
M.M. OUTPUT PORT ADDRESS REGISTER**
R.X. DMA ADDRESS REGISTER**
RESERVED**
BUS THROTTLE TIMERS
DIU CONTROL REGISTER**
RESERVED**
DMA CONTROL REGISTER**
BIU CONTROL REGISTER**
M.M. DISPATCHER REG.**
M.M. STATUS REGISTER**

The 82596 is not Dump compatible with the 82586 because of the 32-bit internal architecture. In 82586
mode the 82596 will dump the same number of bytes as the 82586.

** These bytes are not user defined; results may vary from Dump command to Dump command.

Figure 35. Dump Area Format - Linear and 32-Bit Segmented Mode
Diagnose

The Diagnose Command triggers an internal self-test procedure that checks internal 82596 hardware, which
includes:

• Exponential Backoff Random Number Generator (Linear Feedback Shift Register).

• Exponential Backoff Timeout Counter.

• Slot Time Period Counter.

• Collision Number Counter.

• Exponential Backoff Shift Register.

• Exponential Backoff Mask Logic.

• Timer Trigger Logic.

This procedure checks the operation of the Backoff block, which resides in the serial side and is not easily
controlled. The Diagnose command is performed in two phases.

The format of the 82596 Diagnose command is:

82586 and 32-Bit Segmented Modes

  Odd word (bits 31-16)                Even word (bits 15-0)
  EL, S, I, X..., CMD                  C, B, OK, F, zeros
  X...                                 LINK OFFSET (A15-A0)

Linear Mode

  Odd word (bits 31-16)                Even word (bits 15-0)
  EL, S, I, zeros, CMD                 C, B, OK, F, zeros
  LINK ADDRESS (A31-A0)

Figure 36. Diagnose

where:

LINK ADDRESS,
EL, B, C, I, S     - As per standard Command Block (see the NOP command for details).

Bits 19-28         - Reserved (0 in the 32-bit Segmented and Linear Modes).

CMD (bits 16-18)   - The Diagnose command. Value: 7h.

OK (bit 13)        - Indicates error free completion.

F (bit 11)         - Indicates that the self-test procedure has failed.
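
The command codes given in the sections above (MC Setup 3h, Transmit 4h, TDR 5h, Dump 6h, Diagnose 7h) all occupy bits 16-18 of the first doubleword of a command block. The Python sketch below assembles that doubleword and reads back the Diagnose result bits. It is an illustration with assumed names; in particular, placing EL, S and I in bits 31, 30 and 29 is inferred from their order in the figures and from bits 19-28 being reserved, and should be checked against the standard Command Block description.

```python
CMD_CODES = {
    "mc_setup": 0x3,
    "transmit": 0x4,
    "tdr": 0x5,
    "dump": 0x6,
    "diagnose": 0x7,
}


def command_dword(cmd, el=False, s=False, i=False):
    """Build the first 32-bit entry of a command block (status half cleared).

    EL/S/I are assumed to sit in bits 31/30/29; CMD is in bits 16-18.
    """
    code = CMD_CODES[cmd]
    return (int(el) << 31) | (int(s) << 30) | (int(i) << 29) | (code << 16)


def diagnose_result(status):
    """Interpret the Diagnose status word: OK is bit 13, F is bit 11."""
    return {
        "ok": bool(status & (1 << 13)),
        "failed": bool(status & (1 << 11)),
    }


if __name__ == "__main__":
    print(hex(command_dword("diagnose", el=True)))
    print(diagnose_result(0x2000))
```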
RECEIVE FRAME DESCRIPTOR

Each received frame is described by one Receive Frame Descriptor (see Figure 37). Two new memory
structures are available for the received frames. The structures are available only in the Linear and 32-bit
Segmented modes.

Simplified Memory Structure

The first is the Simplified memory structure. The data section of the received frame is part of the RFD and is
located immediately after the Length Field. Receive Buffer Descriptors are not used with the Simplified
structure; it is primarily used to make programming easier. If the length of the data area described in the
Size Field is smaller than the incoming frame, the following happens.

1. The received frame is truncated.

2. The No Resource error counter is updated.

3. If the 82596 is configured to Save Bad Frames the RFD is not reused; otherwise, the same RFD is used to
hold the next received frame, and the only action taken regarding the truncated frame is to update the
counter.

4. The 82596 continues to receive the next frame in the next RFD.
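
The truncation behaviour in steps 1 to 4 can be modelled in a few lines. The Python sketch below is a behavioural illustration only, not a description of the device's internals; the function name, argument names and result keys are hypothetical, and `frame_len` stands for the number of bytes destined for the RFD data area.

```python
def receive_simplified(frame_len, size, save_bad_frames):
    """Model what happens to one frame in the Simplified memory structure.

    frame_len        bytes of the incoming frame destined for the data area
    size             length of the data area given in the Size field
    save_bad_frames  True if the 82596 is configured to Save Bad Frames
    """
    truncated = frame_len > size
    return {
        "stored_bytes": min(frame_len, size),            # step 1: truncate
        "truncated": truncated,
        "no_resource_count": 1 if truncated else 0,      # step 2: counter update
        # Step 3: a truncated frame keeps its RFD only when bad frames are saved.
        "rfd_reused": truncated and not save_bad_frames,
    }


if __name__ == "__main__":
    print(receive_simplified(1514, 256, save_bad_frames=True))
    print(receive_simplified(1514, 256, save_bad_frames=False))
```

With Save Bad Frames enabled, the first call keeps the 256 stored bytes in their own RFD, which is the monitoring use the text goes on to describe.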



[Diagram 290218-15. Labels: SCB (with RFA POINTER and STATISTICS), TO COMMAND BLOCK LIST, RECEIVE FRAME
DESCRIPTORS, RECEIVE BUFFER DESCRIPTORS (ACT-cnt), RECEIVE BUFFERS (BUFFER 1, BUFFER 2, BUFFER 4; VALID
PARAMETERS, VALID DATA), all within the RECEIVE FRAME AREA, which is divided into the RECEIVE FRAME LIST
and the FREE FRAME LIST.]

Figure 37. The Receive Frame Area
Note that this sequence is very useful for monitoring. If the 82596 is configured to Save Bad Frames, to
receive in Promiscuous mode, and to use the Simplified memory structure, any programmed length of received
data can be saved in memory.

The Simplified memory structure is shown in Figure 38.

[Diagram 290218-16. Labels: CBL POINTER, RFA POINTER, BUS THROTTLE, TO COMMAND LIST, RECEIVE FRAME
DESCRIPTORS (FD2, FD3), each with a STATUS field and a VARIABLE DATA FIELD, all within the RECEIVE FRAME
AREA, which is divided into the RECEIVE FRAME LIST and the FREE FRAME LIST.]

Figure 38. RFA Simplified Memory Structure

Flexible Memory Structure

The second structure is the Flexible memory structure. The data of the received frame is stored in
both the RFD and in a linked list of Receive Buffers and Receive Buffer Descriptors. The received frame is placed
in the RFD as configured in the Size field. Any remaining data is placed in a linked list of RBDs.

The Flexible memory structure is shown in Figure 39.
[Diagram. Labels: CBL POINTER, RFA POINTER, TO COMMAND LIST, RECEIVE FRAME DESCRIPTORS (FD2), each with
STATUS, CONTROL FIELD and VARIABLE DATA FIELD, RECEIVE BUFFER DESCRIPTORS, RECEIVE BUFFERS (BUFFER 1,
BUFFER 2, BUFFER 4; VALID DATA), all within the RECEIVE FRAME AREA, which is divided into the RECEIVE
FRAME LIST and the FREE FRAME LIST.]

Figure 39. RFA Flexible Memory Structure

Buffers on the receive side can be different lengths. The 82596 will not place more bytes into a buffer than
indicated in the associated RBD. The 82596 will fetch the next RBD before it is needed. The 82596 will
attempt to receive frames as long as the FBL is not exhausted. If there are no more buffers, the 82596
Receive Unit will enter the No Resources state. Before starting the RU, the CPU must place the FBL pointer in
the RBD pointer field of the first RFD. All remaining RBD pointer fields for subsequent RFDs should be all 1s. If
the Receive Frame Descriptor and the associated Receive Buffers are not reused (e.g., the frame is properly
received or the 82596 is configured to Save Bad Frames), the 82596 writes the address of the next free RBD
to the RBD pointer field of the next RFD.

Receive Buffer Descriptor (RBD)

The RBDs are used to store received data in a flexible set of linked buffers. The portion of the frame's data
field that is outside the RFD is placed in a set of buffers chained by a sequence of RBDs. The RFD points to
the first RBD, and the last RBD is flagged with an EOF bit set to 1. Each buffer in the linked list of buffers
related to a particular frame can be any size up to 2^14 bytes but must be word aligned (begin on an even
numbered byte). This ensures optimum use of the memory resources while maintaining low overhead. All
buffers in a frame are filled with the received data except for the last, in which the actual count can be smaller
than the allocated buffer space.
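
The way a frame's data spreads over the RFD and the chained buffers in the Flexible structure can be illustrated with a short model. The Python sketch below is a simplification under stated assumptions: the function name is hypothetical, buffer sizes are passed as a plain list in chain order, and running out of buffers is reported by returning `None` rather than by modelling the No Resources state in detail.

```python
def place_frame(data_len, rfd_size, buffer_sizes):
    """Distribute data_len bytes over the RFD data area and a buffer chain.

    Returns (bytes_in_rfd, actual_counts), where actual_counts lists the
    ACTUAL COUNT of every buffer used; the last entry is the one whose RBD
    would carry EOF = 1. Returns None if the buffers are exhausted.
    """
    bytes_in_rfd = min(data_len, rfd_size)
    remaining = data_len - bytes_in_rfd
    actual_counts = []
    for size in buffer_sizes:
        if remaining == 0:
            break
        used = min(remaining, size)  # never more than the RBD's size allows
        actual_counts.append(used)
        remaining -= used
    if remaining:
        return None                  # no more buffers: No Resources
    return bytes_in_rfd, actual_counts


if __name__ == "__main__":
    print(place_frame(1000, 64, [512, 512, 512]))
    print(place_frame(1000, 64, [100]))
```

In the first call every buffer but the last is filled completely, and the last actual count is smaller than its allocated space, as the paragraph above describes.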
  Offset  Odd word (bits 31-16)                Even word (bits 15-0)
  0       EL, S, X...                          C, B, OK, STATUS BITS
  4       RBD OFFSET (A15-A0)                  LINK OFFSET (A15-A0)
  8-19    DESTINATION ADDRESS (1st byte ... 6th byte), followed by
          SOURCE ADDRESS (1st byte ... 6th byte)
  20      X...                                 LENGTH FIELD

Figure 40. Receive Frame Descriptor - 82586 Mode

  Offset  Odd word (bits 31-16)                Even word (bits 15-0)
  0       EL, S, zeros, SF, zeros              C, B, OK, STATUS BITS
  4       RBD OFFSET (A15-A0)                  LINK OFFSET (A15-A0)
  8       SIZE                                 EOF, F, ACTUAL COUNT
  12-23   DESTINATION ADDRESS (1st byte ... 6th byte), followed by
          SOURCE ADDRESS (1st byte ... 6th byte)
  24      LENGTH FIELD, followed by the OPTIONAL DATA AREA

Figure 41. Receive Frame Descriptor - 32-Bit Segmented Mode

  Offset  Odd word (bits 31-16)                Even word (bits 15-0)
  0       EL, S, zeros, SF, zeros              C, B, OK, STATUS BITS
  4       LINK ADDRESS (A31-A0)
  8       RECEIVE BUFFER DESCRIPTOR ADDRESS (A31-A0)
  12      SIZE                                 EOF, F, ACTUAL COUNT
  16-27   DESTINATION ADDRESS (1st byte ... 6th byte), followed by
          SOURCE ADDRESS (1st byte ... 6th byte)
  28      LENGTH FIELD, followed by the OPTIONAL DATA AREA

Figure 42. Receive Frame Descriptor - Linear Mode
where: 

EL - When set, this bit indicates that this RFD is the last one on the RDL. 

S - When set, this bit suspends the RU after receiving the frame. 

SF - This bit selects between the Simplified or the Flexible mode. 
     0: Simplified mode. All the RX data is in the RFD. The RBD ADDRESS field is all "1s." 
     1: Flexible mode. Data is in the RFD and in a linked list of Receive Buffer Descriptors. 

C - This bit indicates the completion of frame reception. It is set by the 82596. 

B - This bit indicates that the 82596 is currently receiving this frame, or that the 82596 
    is ready to receive the frame. It is initially set to 0 by the CPU. The 82596 sets it to 
    1 when reception setup begins, and to 0 upon completion. The C and B bits are 
    set during the same operation. 

OK (bit 13) - Frame received successfully, without errors. RFDs with bit 13 equal to 0 are 
    possible only if the save bad frames configuration option is selected. Otherwise all 
    frames with errors will be discarded, although statistics will be collected on them. 

STATUS - The results of the Receive operation. Defined bits are: 
    Bit 12:   Length error, if configured to check length. 
    Bit 11:   CRC error in an aligned frame. 
    Bit 10:   Alignment error (CRC error in a misaligned frame). 
    Bit 9:    Ran out of buffer space (no resources). 
    Bit 8:    DMA overrun: failure to acquire the system bus. 
    Bit 7:    Frame too short. 
    Bit 6:    No EOP flag (for bit stuffing only). 
    Bit 5:    When the SF bit equals zero, and the 82596 is configured to save bad 
              frames, this bit signals that the receive frame was truncated. Otherwise it 
              is zero. 
    Bits 2-4: Zeros. 
    Bit 1:    When it is zero, the destination address of the received frame matches 
              the IA address. When it is a 1, the destination address of the received 
              frame did not match the individual address. For example, a multicast 
              address or broadcast address will set this bit to a 1. 
    Bit 0:    Receive collision: a collision was detected during reception. 

LINK ADDRESS - A 16-bit offset (32-bit address in the Linear mode) to the next Receive Frame 
    Descriptor. The Link Address of the last frame can be used to form a cyclical list. 

RBD POINTER - The offset (address in the Linear mode) of the first RBD containing the received 
    frame data. An RBD pointer of all ones indicates no RBD. 

EOF, F, SIZE, ACT COUNT - These fields are for the Simplified and Flexible memory models. They 
    are exactly the same as the respective fields in the Receive Buffer Descriptor. See the next 
    section for a detailed explanation of their functions. 

MC - Multicast bit. 

DESTINATION ADDRESS - The contents of the destination address of the received frame. The field is 
    0 to 6 bytes long. 

SOURCE ADDRESS - The contents of the Source Address field of the received frame. It is 0 to 6 
    bytes long. 

LENGTH FIELD - The contents of this 2-byte field are user defined. In 802.3 it contains the length of 
    the data field. It is placed in memory in the same order it is received, i.e., most 
    significant byte first, least significant byte second. 
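
As a minimal sketch of how a driver might interpret these bits, the following Python decodes a 16-bit RFD status word. The function and constant names are illustrative, not part of any Intel software. The positions of C (bit 15) and B (bit 14) are assumptions inferred from the order of the fields in the RFD figures; OK (bit 13) and the error bits follow the list above.

```python
# Bit positions of the RFD status word, taken from the field list above.
# C = bit 15 and B = bit 14 are assumptions inferred from the figure order.
RFD_C = 1 << 15
RFD_B = 1 << 14
RFD_OK = 1 << 13

RFD_ERRORS = {
    12: "length error",
    11: "CRC error",
    10: "alignment error",
    9: "no resources",
    8: "DMA overrun",
    7: "frame too short",
    6: "no EOP flag",
    5: "frame truncated",
    0: "receive collision",
}


def decode_rfd_status(word: int) -> dict:
    """Split a 16-bit RFD status word into named flags."""
    return {
        "complete": bool(word & RFD_C),
        "busy": bool(word & RFD_B),
        "ok": bool(word & RFD_OK),
        # Bit 1 is zero when the destination matched the individual address.
        "ia_match": not (word & (1 << 1)),
        "errors": [name for bit, name in RFD_ERRORS.items() if word & (1 << bit)],
    }


good = decode_rfd_status(0xA000)   # C and OK set, no error bits
bad = decode_rfd_status(0x8802)    # C set, CRC error, address is not the IA
```

Keeping the error bits in a table makes it easy to log every condition reported for a saved bad frame rather than only the first one found.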
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NOTES 

1. The Destination address, Source address and Length fields are packed, i.e., one field immediately follows 
the next. 

2. The effect of the Address/Length Location (No Source Address Insertion) configuration parameter while 
receiving is as follows: 

- 82586 Mode: The Destination address, Source address and Length fields are not used; they are placed in 
the RX data buffers. 

- 32-Bit Segmented and Linear Modes: When the Simplified memory model is used, the Destination address, 
Source address and Length fields reside in their respective fields in the RFD. When the Flexible memory 
structure is used, the Destination address, Source address, and Length field locations depend on the SIZE 
field of the RFD. They can be placed in the RFD, in the RX data buffers, or partially in the RFD and the rest 
in the RX data buffers, depending on the SIZE field value. 





82586 Mode 

Offset  Odd word (bits 31-16)                Even word (bits 15-0)
 0      NEXT RBD OFFSET (A15-A0)             EOF, F, ACTUAL COUNT
 4      X (8 bits), RECEIVE BUFFER ADDRESS (A23-A0)
 8      X (unused)                           EL, X, SIZE

32-Bit Segmented Mode 

Offset  Odd word (bits 31-16)                Even word (bits 15-0)
 0      NEXT RBD OFFSET (A15-A0)             EOF, F, ACTUAL COUNT
 4      RECEIVE BUFFER ADDRESS (A31-A0)
 8      zeros                                EL, P, SIZE

Linear Mode 

Offset  Odd word (bits 31-16)                Even word (bits 15-0)
 0      zeros                                EOF, F, ACTUAL COUNT
 4      NEXT RBD ADDRESS (A31-A0)
 8      RECEIVE BUFFER ADDRESS (A31-A0)
12      zeros                                EL, P, SIZE

Figure 43. Receive Buffer Descriptor 
where: 

EOF - Indicates that this is the last buffer related to the frame. It is cleared by the CPU 
    before starting the RU, and is written by the 82596 at the end of reception of the 
    frame. 

F - Indicates that this buffer has already been used. The Actual Count has no meaning 
    unless the F bit equals one. This bit is cleared by the CPU before starting the RU, 
    and is set by the 82596 after the associated buffer has been used. This bit has the same 
    meaning as the Complete bit in the RFD and CB. 

ACT COUNT - This 14-bit quantity indicates the number of meaningful bytes in the buffer. It is 
    cleared by the CPU before starting the RU, and is written by the 82596 after the 
    associated buffer has been used. In general, after the buffer is full, the 
    Actual Count value equals the size field of the same buffer. For the last buffer of 
    the frame, Actual Count can be less than the buffer size. 

NEXT BD ADDRESS - The offset (absolute address in the Linear mode) of the next RBD on the list. It is 
    meaningless if EL = 1. 

BUFFER ADDRESS - The starting address of the memory area that contains the received data. In the 
    82586 mode, this is a 24-bit address (with pins A24-A31 = 0). In the 32-bit Segmented 
    and Linear modes this is a 32-bit address. 

EL - Indicates that the buffer associated with this RBD is last in the FBL. 

P - This bit indicates that the 82596 has already prefetched the RBDs and any change 
    in the RBD data will be ignored. This bit is valid only in the new 82596 memory 
    modes, and if this feature has been enabled during the Configure command. The 
    82596 prefetches the RBDs in locked cycles; after prefetching the RBD the 82596 
    performs a write cycle where the P bit is set to one and the rest of the data remains 
    unchanged. The CPU is responsible for resetting it in all RBDs. The 82596 will not 
    check this bit before setting it. 

SIZE - This 14-bit quantity indicates the size, in bytes, of the associated buffer. This quantity 
    must be an even number. 
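
The two 16-bit control words of an RBD can be packed and unpacked as in the sketch below. The helper names are hypothetical. The bit positions (EOF and EL in bit 15, F and P in bit 14, with the 14-bit count or size in bits 13-0) are assumptions read from the field order in Figure 43 and the stated 14-bit field widths.

```python
COUNT_MASK = 0x3FFF  # 14-bit ACT COUNT and SIZE fields


def decode_rbd_count(word: int) -> dict:
    """Decode the 16-bit word holding EOF, F and ACT COUNT."""
    return {
        "eof": bool(word & 0x8000),
        "f": bool(word & 0x4000),
        "actual_count": word & COUNT_MASK,
    }


def encode_rbd_size(size: int, el: bool = False) -> int:
    """Build the 16-bit word holding EL, P and SIZE with P cleared."""
    if not 0 <= size <= COUNT_MASK:
        raise ValueError("SIZE must fit in 14 bits")
    if size % 2:
        raise ValueError("SIZE must be an even number")
    return (0x8000 if el else 0) | size


status = decode_rbd_count(0xC05A)        # EOF and F set, 90 bytes used
size_word = encode_rbd_size(1536, el=True)
```

Rejecting odd sizes at encode time reflects the requirement that SIZE be an even number; the P bit is left cleared because the CPU is responsible for resetting it.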







PGA PACKAGE THERMAL SPECIFICATION 

Parameter    Thermal Resistance
θJC          3°C/W
θJA          24°C/W



ELECTRICAL AND TIMING CHARACTERISTICS 

Absolute Maximum Ratings 

• Storage Temperature .................... -65°C to +150°C 
• Case Temperature under Bias ............ -65°C to +110°C 
• Supply Voltage with Respect to VSS ..... -0.5V to +6.5V 
• Voltage on Other Pins .................. -0.5V to VCC + 0.5V 

DC Characteristics 

TC = 0°C-85°C, VCC = 5V ±10%. LE/BE have MOS levels (see VMIL, VMIH). 
All other signals have TTL levels (see VIL, VIH, VOL, VOH). 

Symbol  Parameter                            Min    Max         Units  Notes
VIL     Input Low Voltage (TTL)              -0.3   +0.8        V
VIH     Input High Voltage (TTL)             2.0    VCC + 0.3   V
VMIL    Input Low Voltage (MOS)              -0.3   +0.8        V
VMIH    Input High Voltage (MOS)             3.7    VCC + 0.3   V
VOL     Output Low Voltage (TTL)                    0.45        V      IOL = 4.0 mA
VCIL    RXC, TXC Input Low Voltage           -0.5   0.6         V
VCIH    RXC, TXC Input High Voltage          3.3    VCC + 0.5   V
VOH     Output High Voltage (TTL)            2.4                V      IOH = 0.9 mA-1 mA
ILI     Input Leakage Current                       ±15         µA     0 ≤ VIN ≤ VCC
ILO     Output Leakage Current                      ±15         µA     0.45 < VOUT < VCC
CIN     Capacitance of Input Buffer                 10          pF     FC = 1 MHz
COUT    Capacitance of Input/Output Buffer          12          pF     FC = 1 MHz
CCLK    CLK Capacitance                             20          pF     FC = 1 MHz
ICC     Power Supply                                200         mA     At 25 MHz; ICC Typical = 100 mA
ICC     Power Supply                                300         mA     At 33 MHz; ICC Typical = 150 mA
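
As a rough worked example of how the supply current and the PGA thermal resistance figures combine, the sketch below estimates worst-case power and the resulting temperature rise. It assumes the usual first-order model (temperature rise = power × thermal resistance), which the data sheet itself does not spell out, and it takes VCC at the upper end of the ±10% tolerance. The function names are illustrative.

```python
THETA_JC = 3.0    # deg C per watt, junction to case (PGA package)
THETA_JA = 24.0   # deg C per watt, junction to ambient (PGA package)


def max_power_w(vcc_v: float, icc_ma: float) -> float:
    """Supply power in watts from VCC (volts) and ICC (milliamps)."""
    return vcc_v * icc_ma / 1000.0


def temperature_rise_c(power_w: float, theta_c_per_w: float) -> float:
    """First-order temperature rise across a thermal resistance."""
    return power_w * theta_c_per_w


# 33 MHz part: ICC max = 300 mA, VCC = 5 V + 10% = 5.5 V (assumed worst case)
power = max_power_w(5.5, 300)
rise_junction_over_case = temperature_rise_c(power, THETA_JC)
rise_junction_over_ambient = temperature_rise_c(power, THETA_JA)
```

Under these assumptions the part dissipates about 1.65 W, so the junction sits roughly 5°C above the case and roughly 40°C above still ambient air.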
AC Characteristics 

82596CA INPUT/OUTPUT SYSTEM TIMINGS 

TC = 0°C-85°C, VCC = 5V ±10%. These timings assume the CL on all outputs is 50 pF unless otherwise 
specified. CL can be 20 pF to 120 pF; however, timings must be derated. All timing requirements are given in 
nanoseconds. 

                                                       25 MHz
Symbol  Parameter                                  Min       Max      Notes
        Operating Frequency                        12.5 MHz  25 MHz   1X CLK Input
T1      CLK Period                                 40        80
T1a     CLK Period Stability                                 0.1%     Adjacent CLK Δ
T2      CLK High                                   14                 2.0V
T3      CLK Low                                    14                 0.8V
T4      CLK Rise Time                                        4        0.8V to 2.0V
T5      CLK Fall Time                                        4        2.0V to 0.8V
T6      BEn, LOCK, and A2-A31 Valid Delay          3         22
T6a     BLAST, PCHK Valid Delay                    3         27
T7      BEn, LOCK, BLAST, A2-A31 Float Delay       3         30
T8      W/R and ADS Valid Delay                    3         22
T9      W/R and ADS Float Delay                    3         30
T10     D0-D31, DPn Write Data Valid Delay         3         22
T11     D0-D31, DPn Write Data Float Delay         3         30
T12     HOLD Valid Delay                           3         22
T13     CA and BREQ Setup Time                     7                  1, 2
T14     CA and BREQ Hold Time                      3                  1, 2
T15     BS16 Setup Time                            8                  2
T16     BS16 Hold Time                             3                  2
T17     BRDY, RDY Setup Time                       8                  2
T18     BRDY, RDY Hold Time                        3                  2
T19     D0-D31, DPn READ Setup Time                5                  2
T20     D0-D31, DPn READ Hold Time                 3                  2
T21     AHOLD and HLDA Setup Time                  10                 1, 2
T22     AHOLD Hold Time                            3                  1, 2
T22a    HLDA Hold Time                             3                  1, 2
T23     RESET Setup Time                           10                 1, 2
T24     RESET Hold Time                            3                  1, 2
T25     INT/INT Valid Delay                        1         26
T26     CA and BREQ, PORT Pulse Width              2T1                1, 2, 3
T27     D0-D31 CPU PORT Access Setup Time          5                  2
T28     D0-D31 CPU PORT Access Hold Time           3                  2
T29     PORT Setup Time                            7                  2
T30     PORT Hold Time                             3                  2
T31     BOFF Setup Time                            10                 2
T32     BOFF Hold Time                             3                  2
AC Characteristics (Continued) 

82596CA INPUT/OUTPUT SYSTEM TIMINGS 

TC = 0°C-85°C, VCC = 5V ±5%. These timings assume the CL on all outputs is 50 pF unless otherwise 
specified. CL can be 20 pF to 120 pF; however, timings must be derated. All timing requirements are given in 
nanoseconds. 

                                                       33 MHz
Symbol  Parameter                                  Min       Max      Notes
        Operating Frequency                        12.5 MHz  33 MHz   1X CLK Input
T1      CLK Period                                 30        80
T1a     CLK Period Stability                                 0.1%     Adjacent CLK Δ
T2      CLK High                                   11                 2.0V
T3      CLK Low                                    11                 0.8V
T4      CLK Rise Time                                        3        0.8V to 2.0V
T5      CLK Fall Time                                        3        2.0V to 0.8V
T6      BEn, LOCK, and A2-A31 Valid Delay          3         19
T6a     BLAST, PCHK Valid Delay                    3         22
T7      BEn, LOCK, BLAST, A2-A31 Float Delay       3         20
T8      W/R and ADS Valid Delay                    3         19
T9      W/R and ADS Float Delay                    3         20
T10     D0-D31, DPn Write Data Valid Delay         3         19
T11     D0-D31, DPn Write Data Float Delay         3         20
T12     HOLD Valid Delay                           3         19
T13     CA and BREQ Setup Time                     7                  1, 2
T14     CA and BREQ Hold Time                      3                  1, 2
T15     BS16 Setup Time                            6                  2
T16     BS16 Hold Time                             3                  2
T17     BRDY, RDY Setup Time                       6                  2
T18     BRDY, RDY Hold Time                        3                  2
T19     D0-D31, DPn READ Setup Time                5                  2
T20     D0-D31, DPn READ Hold Time                 3                  2
T21     AHOLD and HLDA Setup Time                  8                  1, 2
T22     AHOLD Hold Time                            3                  1, 2
T22a    HLDA Hold Time                             3                  1, 2
T23     RESET Setup Time                           8                  1, 2
T24     RESET Hold Time                            3                  1, 2
T25     INT/INT Valid Delay                        1         20
T26     CA and BREQ, PORT Pulse Width              2T1                1, 2, 3
T27     D0-D31 CPU PORT Access Setup Time          5                  2
T28     D0-D31 CPU PORT Access Hold Time           3                  2
T29     PORT Setup Time                            7                  2
T30     PORT Hold Time                             3                  2
T31     BOFF Setup Time                            8                  2
T32     BOFF Hold Time                             3                  2



NOTES: 

1. RESET, HLDA, and CA are internally synchronized. This timing is to guarantee recognition at the next clock for 
RESET, HLDA, and CA. 

2. All setup, hold, and delay timings are at the maximum frequency specification Fmax, and must be derated according 
to the following equation for operation at lower frequencies: 

Tderated = (Fmax/Fopr) × T 

where: 

Tderated = The derated value of the specification. 

Fmax = Maximum operating frequency. 

Fopr = Actual operating frequency. 

T = Specification at maximum frequency. 

This calculation provides only a rough estimate for derating the frequency. For more detailed information, contact your 
Intel Sales Office for the data sheet supplement. 

3. CA pulse width need only be 1 T1 wide if the setup and hold times are met; BREQ must meet setup and hold times and 
need only be 1 T1 wide. 
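
The derating equation in note 2 is simple enough to capture in a helper. The sketch below is an illustration with a hypothetical function name; as the note says, the result is only a rough estimate.

```python
def derate_ns(t_spec_ns: float, f_max_mhz: float, f_opr_mhz: float) -> float:
    """Apply Tderated = (Fmax / Fopr) * T from note 2."""
    if not 0 < f_opr_mhz <= f_max_mhz:
        raise ValueError("operating frequency must be in (0, Fmax]")
    return (f_max_mhz / f_opr_mhz) * t_spec_ns


# T13 (CA and BREQ setup) is 7 ns at 33 MHz; estimate it at 25 MHz.
t13_at_25 = derate_ns(7, 33, 25)
```

For this example the estimate is about 9.24 ns, since the specification scales by 33/25 = 1.32.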

TRANSMIT/RECEIVE CLOCK PARAMETERS 

                                                                     20 MHz
Symbol  Parameter                                                  Min    Max    Notes
T36     TxC Cycle                                                  50            1, 3
T38     TxC Rise Time                                                     5      1
T39     TxC Fall Time                                                     5      1
T40     TxC High Time                                              19            1, 3
T41     TxC Low Time                                               18            1, 3
T42     TxD Rise Time                                                     10     4
T43     TxD Fall Time                                                     10     4
T44     TxD Transition                                             20            2, 4
T45     TxC Low to TxD Valid                                              25     4, 6
T46     TxC Low to TxD Transition                                         25     2, 4
T47     TxC High to TxD Transition                                        25     2, 4
T48     TxC Low to TxD High (At End of Transition)                        25     4

RTS AND CTS PARAMETERS
T49     TxC Low to RTS Low, Time to Activate RTS                          25     5
T50     CTS Low to TxC Low, CTS Setup Time                                20
T51     TxC Low to CTS Invalid, CTS Hold Time                      10            7
T52     TxC Low to RTS High                                               25     5

RECEIVE CLOCK PARAMETERS
T53     RXC Cycle                                                  50            1, 3
T54     RXC Rise Time                                                     5      1
T55     RXC Fall Time                                                     5      1
T56     RXC High Time                                              19            1
T57     RXC Low Time                                               18            1

RECEIVED DATA PARAMETERS
T58     RXD Setup Time                                             20            6
T59     RXD Hold Time                                              10            6
T60     RXD Rise Time                                                     10
T61     RXD Fall Time                                                     10

CRS AND CDT PARAMETERS
T62     CDT Low to TXC High, External Collision Detect Setup Time  20
T63     TXC High to CDT Inactive, CDT Hold Time                    10
T64     CDT Low to Jam Start                                                     10
T65     CRS Low to TXC High, Carrier Sense Setup Time              20
T66     TXC High to CRS Inactive, CRS Hold Time                    10
        (Internal Collision Detect)
T67     CRS High to Jamming Start                                                12
T68     Jamming Period                                                           11
T69     CRS High to RXC High, CRS Inactive Setup Time              30
T70     RXC High to CRS High, CRS Inactive Hold Time               10

INTERFRAME SPACING PARAMETERS
T71     Interframe Delay                                                         9

EXTERNAL LOOPBACK-PIN PARAMETERS
T72     TXC Low to LPBK Low                                               T36    4
T73     TXC Low to LPBK High                                              T36    4



NOTES: 

1. Special MOS levels. VCIL = 0.9V and VCIH = 3.0V. 

2. Manchester only. 

3. Manchester. Needs 50% duty cycle. 

4. 1 TTL load + 50 pF. 

5. 1 TTL load + 100 pF. 

6. NRZ only. 

7. Abnormal end of transmission: CTS expires before RTS. 

8. Normal end to transmission. 

9. Programmable value: 
T71 = NIFS * T36 
where: NIFS = the IFS configuration value 
(if NIFS is less than 12 then NIFS is forced to 12). 

10. Programmable value: 
T64 = (NCDF * T36) + x * T36 
(If the collision occurs after the preamble) 
where: NCDF = the collision detect filter configuration value, and 
x = 12, 13, 14, or 15 

11. T68 = 32 * T36 

12. Programmable value: 
T67 = (NCSF * T36) + x * T36 
where: NCSF = the Carrier Sense Filter configuration value, and 
x = 12, 13, 14, or 15 

13. To guarantee recognition on the next clock. 
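
The programmable values in notes 9 through 11 can be evaluated as in the sketch below. The function names are hypothetical, the example clock period is an assumption chosen for illustration, and the T64 expression is read as (NCDF * T36) + x * T36 by analogy with the T67 expression in note 12.

```python
MIN_IFS = 12  # NIFS values below 12 are forced to 12 (note 9)


def interframe_delay_ns(n_ifs: int, t36_ns: float) -> float:
    """T71 = NIFS * T36, with NIFS clamped to at least 12."""
    return max(n_ifs, MIN_IFS) * t36_ns


def jamming_period_ns(t36_ns: float) -> float:
    """T68 = 32 * T36 (note 11)."""
    return 32 * t36_ns


def cdt_to_jam_ns(n_cdf: int, x: int, t36_ns: float) -> float:
    """T64 = (NCDF * T36) + x * T36, x in 12..15 (note 10)."""
    if x not in (12, 13, 14, 15):
        raise ValueError("x must be 12, 13, 14, or 15")
    return (n_cdf + x) * t36_ns


# Example with a 10 MHz serial clock, i.e. T36 = 100 ns (assumed for illustration)
ifs = interframe_delay_ns(96, 100)
ifs_clamped = interframe_delay_ns(4, 100)
jam = jamming_period_ns(100)
```

With a 100 ns transmit clock, an IFS configuration value of 96 gives 9.6 µs, and any configuration value below 12 yields the same 1.2 µs as a value of 12.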
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82596CA BUS OPERATION 

The following figures show the 82596CA basic bus cycle and basic burst cycle. 
Please refer to the 32-Bit LAN Component User's Manual. 
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Figure 44. Basic 82596CA Bus Cycle 
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Figure 45. Basic 82596CA Burst Cycle 
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SYSTEM INTERFACE A.C. TIMING CHARACTERISTICS 

The measurements should be done at: 

• TC = 0°C-85°C, VCC = 5V ±10%, C = 50 pF unless otherwise specified. 

• A.C. testing inputs are driven at 2.4V for a logic "1" and 0.45V for a logic "0". 

• Timing measurements are made at 1.5V for both logic "1" and "0". 

• Rise and fall times of input and output signals are measured between 0.8V and 2.0V unless 
otherwise specified. 

• All timings are relative to CLK crossing the 1.5V level. 

• All A.C. parameters are valid only after 100 µs from power up. 

[Figure: A.C. testing input waveform, driven between 2.4V and 0.45V, with the test points marked.] 
Figure 46. CLK Timings 

Two types of timing specifications are presented below: 

1. Input Timing — minimum setup and hold times. 

2. Output Timings — output delays and float times from CLK rising edge. 

Figure 47 defines how the measurements should be done: 



LEGEND: 

Ts = Input Setup Time 
Th = Input Hold Time 
Tn = Minimum output delay or Minimum float delay 
Tx = Maximum output delay or Maximum float delay 

Figure 47. Drive Levels and Measurement Points for A.C. Specifications 

Ts = T13, T15, T17, T19, T21, T23, T27, T29, T31 
Th = T14, T16, T18, T20, T22, T22a, T24, T28, T30, T32 
Tn = T6, T6a, T7, T8, T9, T10, T11, T12, T25 
Tx = T6, T6a, T7, T8, T9, T10, T11, T12, T25 
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INPUT WAVEFORMS 




Figure 48. CA and BREQ Input Timing 




Figure 49. INT/INT Output Timing 



[Figure: waveforms for CLK, HOLD, BOFF, AHOLD, and HLDA, annotated with T12, T22, T22a, T31, and T32.] 

Figure 50. HOLD/HLDA Timings 
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Figure 51. Input Setup and Hold Time 
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Figure 52. Output Valid Delay Timing 
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Figure 53. Output Float Delay Timing 
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Figure 54. PORT Setup and Hold Time 
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Figure 55. RESET Input Timing 
SERIAL AC TIMING CHARACTERISTICS 
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Figure 56. Serial Input Clock Timing 



[Figure: waveforms for the transmit clock, RTS, CTS, CDT, CRS, TXD (NRZ), and TXD (Manchester), annotated with T62 through T68.] 

Figure 57. Transmit Data Waveforms 
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Figure 58. Transmit Data Waveforms 
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Figure 59. Receive Data Waveforms (NRZ) 
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Figure 60. Receive Data Waveforms (CRS) 
Family: Ceramic Pin Grid Array Package 

               Millimeters                    Inches
Symbol   Min     Max     Notes          Min     Max     Notes
A        3.56    4.57                   0.140   0.180
A1       0.76    1.27    Solid Lid      0.030   0.050   Solid Lid
A2       2.67    3.43    Solid Lid      0.105   0.135   Solid Lid
A3       1.14    1.40                   0.045   0.055
B        0.43    0.51                   0.017   0.020
D        36.45   37.21                  1.435   1.465
D1       32.89   33.15                  1.295   1.305
e1       2.29    2.79                   0.090   0.110
L        2.54    3.30                   0.100   0.130
N        132                            132
S1       1.27    2.54                   0.050   0.100
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12/88 










Intel Case Outline Drawings 

Plastic Quad Flat Pack (PQFP) 

0.025 Inch (0.635mm) Pitch 



Dimensions in inches, given as Min/Max for each lead count 

Symbol  Description           68           84           100          132          164          196
N       Leadcount             68           84           100          132          164          196
A       Package Height        0.160/0.170  0.160/0.170  0.160/0.170  0.160/0.170  0.160/0.170  0.160/0.170
A1      Standoff              0.020/0.030  0.020/0.030  0.020/0.030  0.020/0.030  0.020/0.030  0.020/0.030
D, E    Terminal Dimension    0.675/0.685  0.775/0.785  0.875/0.885  1.075/1.085  1.275/1.285  1.475/1.485
D1, E1  Package Body          0.547/0.553  0.647/0.653  0.747/0.753  0.947/0.953  1.147/1.153  1.347/1.353
D2, E2  Bumper Distance       0.697/0.703  0.797/0.803  0.897/0.903  1.097/1.103  1.297/1.303  1.497/1.503
D3, E3  Lead Dimension        0.400 REF    0.500 REF    0.600 REF    0.800 REF    1.000 REF    1.200 REF
D4, E4  Foot Radius Location  0.623/0.637  0.723/0.737  0.823/0.837  1.023/1.037  1.223/1.237  1.423/1.437
L1      Foot Length           0.020/0.030  0.020/0.030  0.020/0.030  0.020/0.030  0.020/0.030  0.020/0.030


Issue 
Dimensions in millimeters, given as Min/Max for each lead count 

Symbol  Description           68           84           100          132          164          196
N       Leadcount             68           84           100          132          164          196
A       Package Height        4.06/4.32    4.06/4.32    4.06/4.32    4.06/4.32    4.06/4.32    4.06/4.32
A1      Standoff              0.51/0.76    0.51/0.76    0.51/0.76    0.51/0.76    0.51/0.76    0.51/0.76
D, E    Terminal Dimension    17.15/17.40  19.69/19.94  22.23/22.48  27.31/27.56  32.39/32.64  37.47/37.72
D1, E1  Package Body          13.89/14.05  16.43/16.59  18.97/19.13  24.05/24.21  29.13/29.29  34.21/34.37
D2, E2  Bumper Distance       17.70/17.85  20.24/20.39  22.78/22.93  27.86/28.01  32.94/33.09  38.02/38.18
D3, E3  Lead Dimension        10.16 REF    12.70 REF    15.24 REF    20.32 REF    25.40 REF    30.48 REF
D4, E4  Foot Radius Location  15.82/16.17  18.36/18.71  21.25/21.25  25.89/26.33  31.06/31.41  36.14/36.49
L1      Foot Length           0.51/0.76    0.51/0.76    0.51/0.76    0.51/0.76    0.51/0.76    0.51/0.76
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Figure 61. Principal Dimensions and Datums 
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Figure 62. Molded Details 
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Figure 63. Terminal Details 
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Figure 64. Typical Lead 
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COMPREHENSIVE SOFTWARE DEBUG SUPPORT FOR i960TM 
EMBEDDED APPLICATIONS 

Intel provides comprehensive software debug support for all members of the i960TM 
component architecture, including the newest members, the i960SA and i960SB. All 
Intel's i960 software debug products share the same high-level, windowed user interface 
emerging as the standard for all i960 tools from Intel. This innovative debug interface 
allows users to focus their efforts on finding bugs rather than spending time learning and 
manipulating the debug environment. 

Intel's i960 software debug tools support a wide variety of debug environments, including 
code debug on a simulated target environment, a PC-based evaluation board, a serial- 
based Intel evaluation board, or a serial-based, customized target system. 

GENERAL i960 SOFTWARE DEBUGGER FEATURES 



• Windowed, pull-down menu user interface shared by other i960 Development Tools 
• Full symbolic debug with source level display allows C or assembly code debugging 
• Debugging productivity enhanced by ability to quickly browse source code and view call 
  stacks or symbol run-time values 
• Breakpoints may be defined symbolically using module names, procedure names and line 
  numbers 
• Single step execution, code assembly/disassembly, memory and register display/modification 
• Run-time library support allows programs to access host files and perform I/O 



"IBM, PC/AT, and Personal System/2 are registered trademarks of International Business Machines Corporation. 

* Compaq is a registered trademark of the Compaq Corporation. 

* Intel is a registered trademark of the Intel Corporation. 
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FEATURES 



EASY TO USE, POWERFUL 
USER INTERFACE 

All i960 debuggers share the same high-level, 
powerful user interface as other i960 
development tools. Utilizing pulldown menus, 
users have access to a color, windowed 
environment featuring source-level, symbolic 
debugging. Multiple, non-overlapping windows 
can be used to display source code, registers, 
variable values, and command line entries. 

DEBUGGING FEATURES 

High-level source or disassembled code can be 
displayed in the source window. Users can 
scroll through the source, browse from module 
to module in a program, scope to any 
executable point in the source, or 
instantaneously relocate from a symbol name 
to the location where it was defined 
(hyperscope operation). Symbol names in the 
source can be highlighted to inspect the 
current run-time value of program variables. 
Call stacks can be examined to trace execution 
flow. 

A variety of breakpoints can be specified 
including source breakpoints, watch points, 
passpoints, or event-action breakpoints. 
Breakpoints can be defined symbolically using 
module names, procedure names and line 
numbers. Watch points allow users to observe 
a variable as it changes during program 
execution. Passpoints display a message when 
a specified instruction is executed, giving the 
user a non-realtime way to track execution of 
key code sequences without halting instruction 
flow. The event-action form allows complex 
breakpoint conditions to be set up, including 
data breakpoints (when supported by on-chip 
registers). 

Users can step through program execution via 
a single assembly language instruction, a high- 
level language statement or a high-level 
function or procedure. Memory can be 
displayed or modified as common data types 
and all processor registers and system tables 
can be examined or changed. 

Expressions involving symbol names, memory 
references, or both, can be defined as watch 
expressions whose values are monitored in a 
Watch window as a program executes. The 
i960 family of software debuggers also allows 
screen flipping between the debugger 
environment and the display output from the 
program. 



Low level, run time libraries are provided that 
allow programs running on an i960 board to 
access the file system on the host or to perform 
I/O operations. 

RETARGETABLE SOFTWARE 
DEBUGGER 

Intel's DB-960 Retargetable Software 
Debugger is a combination application and 
system level debugger designed for use with 
the i960 family of embedded microprocessors. 
DB-960's retargetable monitor can be 
customized to a target system, allowing source- 
level, symbolic debug across a serial interface 
cable. 

RETARGETABLE MONITOR 

Utilizing a combination of object files and 
source code, a retargetable monitor is provided 
with DB-960 for users to customize and 
incorporate into their proprietary target 
systems. This retargetable monitor is designed 
to support all members of the i960 family. Most 
of the monitor code is provided in object code 
and does not need to be changed. Hardware- 
dependent source code is supplied for 
modification by users. Example code is 
provided for porting the monitor to the Intel 
EV80960CA and QT960 target boards. Both 
boards use an Intel 82510 UART serial 
controller chip and the Intel 82C54 Counter/ 
Timer. 

HARDWARE DEBUG 

DB-960 takes advantage of on-chip debug 
registers like those found on the i960CA to 
provide two hardware execution address 
breakpoints and two data address breakpoints. 
Once the monitor has been retargeted to the 
target system, hardware designers can 
download initialization code, read/write to 
registers and examine memory or register 
contents. 

HIGH SPEED SERIAL LINK 

DB-960 communications between the host and 
target system is supported via RS232 and 
RS422 communication links. RS232 allows 
access to industry standard serial protocols 
while the RS422 interface provides higher 
speed communication (up to 115K baud) for 
faster code and data download. PC-AT bus-compatible 
RS422 communication boards are 
available from various third party vendors. 
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FEATURES 



CUSTOMIZED ENVIRONMENT 

Because the user has control over the target 
board and serial driver source code, a highly 
customized target environment can be 
developed. Serial communication functions can 
be modified to allow for parallel 
communication schemes, allowing faster 
download speeds. 

LICENSING 

There are no incorporation or royalty fees for 
customers shipping the retargeted DB-960 
monitor with their product or system. 

PC-BASED SOFTWARE 
DEBUGGER 

The DB960KBDEVA Software Debugger is 
designed for debugging i960KA or i960KB code 
executing on an Intel EVA-960KB4MB 
Software Execution Vehicle plugged into PC- 
ATs or compatibles using DOS. 
DB960KBDEVA offers the same powerful 
debug user interface as other i960 software 
debuggers and utilizes I/O resources provided 
by the PC. Due to compatibility with the 
i960KA and i960KB, i960SA and i960SB code 
can be executed and debugged using the Intel 
EVA-960KB4MB Software Execution Vehicle 
in conjunction with the DB960KBDEVA 
Software Debugger. 

SIMULATOR-BASED 
SOFTWARE DEBUGGER 

The DBSIM960 Debug Simulator combines an 
i960 CA/KA/SA instruction-level simulator 
with the easy to use, powerful DB960 software 
debugger interface. Users can debug i960 
applications without a hardware target system 
being available, allowing products to get to 
market sooner. For i960 CA designs, 
performance information is provided, with 
timing profiles accurate to plus or minus 5%. 

Users can specify the target system's clock 
speed and wait-state information for each 
region of memory.* DBSIM960 uses this 
information to provide i960 CA performance 
statistics. DBSIM960 expects COFF executable 
files generated by Intel's CTOOLS960 compiler 
and assembler. Execution flow can be 
monitored by using a trace capability, which 
reports the 8 digit cycle address, 8 digit 
instruction pointer value, and the 
disassembled instruction for each operation. 



Program execution statistics reported 
include: 

• Total number of instructions executed 
• Total time 
• Number of times a call caused processor to 
  write registers to external memory 
• Current clock setting in cycles per second 
• Current wait-state setting for each of the 16 
  memory regions 
• Number of instruction words executed from 
  cache rather than external memory 
• Total number of cycles elapsed 
• Number of stack frames or register sets 
  cached on chip 
• Number of times an unaligned load or store 
  operation occurred 
• Bus utilization 
• Branch prediction efficiency 
• Usage for load, store, call and branch cache 
  instructions 

In general, DBSIM960 provides the full
symbolic debug capabilities found in the i960
family of debug tools, while providing a
complete benchmarking environment prior to
target system availability.

*Because users can easily change the wait-state definition for
their code, hardware and software designs can be
optimized before any hardware development takes place.

IN-CIRCUIT DEBUG MONITOR 

Intel's DB960CADIC in-circuit debug monitor,
hosted on extended DOS/386, allows users to
debug high-speed, cached applications at the 
full speed of the i960CA target processor. 
DB960CADIC can be used by both hardware 
and software developers, at any stage of design. 
Early in the development process, 
DB960CADIC allows software debugging when 
inserted into an existing i960CA board such as 
the EV80960CA, or in the DB960CASAST 
stand-alone self-test unit. Later in the design 
cycle, DB960CADIC can be inserted into the 
user's target system, facilitating debug of 
hardware/software integration. 

DB960CADIC offers the same windowed debug
user interface as other i960 software debuggers
and is also available with an optional 4 MB
stand-alone self-test chassis to debug and test
code before prototype hardware is available.
For further information, see fact sheet 
#280900 from Intel. 
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FEATURES 



SOFTWARE COMPLETES THE 
SYSTEM 

Intel provides a comprehensive software
development environment to complement
DB-960. This environment includes a C compiler,
an i960 assembler, a system generator for
automating the compilation process, and
instruction-level simulators. The languages
support the entire range of i960 embedded
processors.

WORLDWIDE SERVICE, 
SUPPORT, AND TRAINING 

To augment its development tools, Intel offers 
a full array of seminars, classes, workshops, 
field application engineering expertise, hotline 
technical support, and on-site service. 



Intel also offers a Software Support Contract
which includes technical software information,
automatic distribution of software and
documentation updates, the iCOMMENTS
publication, remote diagnostic software, and a
development tools troubleshooting guide.

Intel's 90-day Hardware Support package
includes technical hardware information, 
telephone support, warranty on parts, labor, 
material, and on-site hardware support. 

Intel Development Tools also offers a 30-day, 
money-back guarantee to customers who are 
not satisfied after purchasing any Intel 
development tool. 



SPECIFICATIONS AND REQUIREMENTS 



HOST SYSTEM REQUIREMENTS 

Host system requirements to run Intel's i960
family of software debuggers include the
following:

• DOS version 3.3 or later excluding DOS 4.0 

• 640K bytes of RAM in conventional memory 

• A fixed disk drive with at least 1.25M bytes 
of free disk space 

• One disk drive capable of reading 5.25 inch, 
360K byte disks 

• RS232 serial port (COM1 or COM2) 



Evaluated Systems include: 

IBM PC-AT* with DOS 3.3 

COMPAQ 386* with DOS 3.3 

Intel 30y 3 02* with DOS 3.3 

IBM Personal System/2* Model 70/80 with DOS 4.01
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ORDERING INFORMATION 



DB960KBDEV     DOS-based, retargetable software debugger for the
               i960KA, i960KB, i960SA, i960SB and i960CA embedded
               microprocessors. Includes host debug software,
               retargetable monitor, host I/O libraries and
               documentation.

DB960KBDEVA    DOS-based source level debugger for the i960KA,
               i960KB, i960SA and i960SB embedded microprocessors.
               Requires EVA-960KB4MB Software Execution Vehicle
               and PC AT compatible bus.

DBSIM960D      DOS/386-hosted debug simulator for the i960 CA,
               i960 KA and i960 SA which utilizes an i960 CA
               instruction-level simulator allowing code
               development and debug prior to hardware prototype
               availability.

DBSIM960S      UNIX System V/386-hosted debug simulator for the
               i960 CA, i960 KA and i960 SA which utilizes an
               i960 CA instruction-level simulator allowing code
               development and debug prior to hardware prototype
               availability.

DBSIM960R      IBM RS/6000-hosted debug simulator for the i960 CA,
               i960 KA and i960 SA which utilizes an i960 CA
               instruction-level simulator allowing code
               development and debug prior to hardware prototype
               availability.

DB960CADIC     DOS/386 hosted in-circuit debug monitor for i960CA
               only. Includes small board with i960CA processor,
               system debug monitor and serial interface. Plugs
               into i960CA socket on hardware prototype system.

DB960CASAST    Standalone Self Test Unit for DB960CADIC. Includes
               built-in power supply, self-test board, 4M byte of
               usable DRAM for code development and enclosure.

To order your Intel Development Tool product,
for more information, or for the number of
your nearest sales office or distributor, call
800-874-6835 (North America). For literature
on other Intel products call 800-548-4725
(North America). Outside of North America,
please contact your local Intel sales office or
distributor for more information.
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EXV-960MC EXECUTION VEHICLE 




80960MC-BASED TARGET SYSTEM SUPPORTING EARLY 
SOFTWARE DEVELOPMENT AND BENCHMARKING 

EXV-960MC is a software execution vehicle designed to support 80960MC-based designs. 
Users can use the EXV-960MC board to execute and debug their application software 
before a functional hardware prototype is available. The EXV-960MC is also designed 
with programmable waitstate SRAM to support benchmarking activities. The EXV- 
960MC is supported by the complete set of Intel C, assembler and Ada code generation 
tools. Both of the VAX/VMS*-hosted 80960MC software debuggers, the SDM-960MC 
system debug monitor and the Ada-960MC source-level debugger, can be used for 
debugging software running on the EXV-960MC. 

EXV-960MC includes a Multibus I form factor board and a set of SDM-960MC target 
monitor EPROMS. The SDM-960MC and the Ada-960MC debugger are preconfigured to 
support the EXV-960MC execution environment. Designers can select the software 
debugger best suited to their development needs. The Ada-960MC debugger is a source- 
level symbolic debugger which provides a productive debugging environment for Ada 
applications. The SDM-960MC debug monitor offers a complete debugging facility for 
applications written in C, assembler or Ada. 



* VAX/VMS is a trademark of Digital Equipment Corp. 
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SBM-960MC RETARGETABLE SYSTEM DEBUG 

MONITOR 



FEATURES 

• 25 MHz 80960MC processor

• 256 Kbytes of (0,0,0,0) programmable wait-state SRAM

• 4 Mbytes dual-ported (3,1,1,1) wait-state DRAM

• iSBX™ interface

• Two serial ports, one bi-directional parallel port

• 8254 programmable interval timer

• 8259A programmable interrupt controller

ELECTRICAL CHARACTERISTICS 

10 A @ +5V 
50 mA @ +12V
50 mA @ -12V

ENVIRONMENTAL CHARACTERISTICS 

Operating temperature: 0° to +60°C (32° to 140°F), 300 LFM
Operating Humidity: 10% to 90% non-condensing 

SOFTWARE DEBUGGING SUPPORT 

The SDM-960MC is a VAX/VMS*-hosted system debug monitor that provides a complete, flexible 
environment to execute and debug 80960MC-based applications. Users can tailor the execution 
environment as software development evolves. Initially, the application may require the full 
support of the system debug monitor to establish a run-time environment. As the application 
evolves, the SDM-960MC allows the application to take more of the responsibility for system 
functions. 

The default execution environment of the SDM-960MC is the EXV-960MC execution vehicle. The 
VAX-hosted portion of the SDM-960MC debug monitor provides complete on-target debugging 
support through its interface with the target-resident portion of the SDM-960MC. To facilitate 
debugging on a user's custom target system, the SDM-960MC includes source and object files 
necessary to reconfigure the target monitor. SDM-960MC and other 80960MC development tools 
allow the developers to take full advantage of the 80960MC processor. 




FEATURES 

• assemble and disassemble 80960MC
instructions

• single step program execution

• access to memory and processor resources

• support 64 execution breakpoints

• issue Interagent Communications (IACs)

• powerful execution trace

• serial download

HARDWARE REQUIREMENTS 

• a serial interface 

• 25 Kbytes of EPROM

• contiguous 50 Kbytes of RAM 



WORLDWIDE SERVICE AND 
SUPPORT 

Intel augments its 80960 architecture family 
development tools with a full array of 
seminars, classes, and workshops; on-site 
consulting services and telephone support are 
available at all stages of development. 

ORDERING INFORMATION 

Product Code Description 

EXV960MC 80960MC execution vehicle 
(board and target EPROM) 

SDM960MC VAX, MicroVAX/VMS 
hosted System Debug 
Monitor, retargetable source 
is included 
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80960SA/SB DEVELOPMENT SUPPORT 




COMPREHENSIVE DEVELOPMENT SUPPORT FOR 80960SA/SB
EMBEDDED APPLICATIONS

Intel provides comprehensive development support for the 80960 component
architecture, including the newest members, the 80960SA and 80960SB. Tools range from
compilers to simulators and from debuggers to emulators, all designed specifically for
members of the 80960 family, allowing you to take full advantage of their RISC-based
design while reducing time to market.

DEVELOPMENT TOOLS AVAILABLE: 



• ASM-960 macro assembler for
developing and tuning speed-critical code

• iC-960 highly optimizing C language
compiler for high-level language
software development

• GEN-960 system generator for
initializing your design to take
advantage of 80960 on-chip features

• DBSIM960KA debug simulator for
80960KA and 80960SA applications

• Windowed, interactive, source-level DB-960
debugger which can be targeted to
one of the evaluation and development
boards below, or customized to your
target system

• Evaluation and development boards
including the EV960SB, the QT80960KB,
and the EVA960KB

• ICE-960SA/SB offers a full featured
in-circuit emulator for the 80960SA/SB
components
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ASM-960 MACRO ASSEMBLER 

The ASM-960 macro assembler is used to fine- 
tune sections of code for peak program 
execution speed on the 80960SA, 80960SB, 
80960KA, 80960KB, 80960MC, and 80960CA. 
ASM-960 does this by giving you absolute 
control over program instructions. In addition 
to the assembler and macro preprocessor, 
ASM-960 includes several utilities for 
application program maintenance and debug:

• LINKER provides incremental program
linking/locating and link-time optimization.

• ARCHIVER allows you to build reusable
function libraries for applications.

• DISASSEMBLER produces assembly
language from object files.

• SYMBOL DUMPER provides symbolic
information from a program file for
facilitating low-level debug.

• ROM IMAGE BUILDER produces a hex file
suitable for PROM programmers.

• Macro preprocessor provides code generation
flexibility and improves code readability,
reducing maintenance costs.

A Floating Point Arithmetic Library (FPAL) is 
included for the 80960SA, 80960KA, and 
80960CA components. It eliminates the need to 
develop your own floating point code. 



GEN-960 SYSTEM GENERATOR 

The 80960 System Generator (GEN-960) helps 
you set up data structures for standalone, 
embedded applications that use the on-chip 
features of the 80960 architecture. GEN-960 is 
used with other 80960 tools to generate and 
refine ROM or RAM code. GEN-960 supplies a 
set of command and template files containing 
assembly code and linker control commands to 
set up processor control blocks, inter-agent 
communication mechanisms, system procedure 
tables, and other requirements for 
initialization. The result is a batch file 
containing all the commands needed to 
compile, assemble and link the final target 
system. 

• Improves engineering productivity by
automating the compilation, assembly and
linking process

• Supplies sample initialization code, reducing
programming time

• Saves engineering time by simplifying the
task of initializing each processor for on-chip
capabilities
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iC-960 COMPILER 

iC-960 is a highly optimizing C language 
compiler for the 80960 family of 
microprocessors. iC-960 supports the full C 
language as described in the Kernighan and 
Ritchie book, The C Programming Language 
(Prentice-Hall, 1978). iC-960 includes standard 
ANSI extensions to the C language and is used 
in conjunction with ASM-960 for creating 
object files. 

The iC-960 compiler supports a number of 
processor dependent optimizations including 
global register allocation, constant 
propagation, arithmetic identity folding, 
redundant load/store elimination, strength
reduction and register allocation/scheduling of 
arguments. Processor independent 
optimizations include common sub-expression 
elimination, folding of constant expressions, 
elimination of superfluous branches, removing 
unreachable code, tail recursion and procedure 
incorporation. 

iC-960 includes a standard C library with I/O 
functions and mathematical routines. A second 
library provides low level, environment- 
dependent routines emulating UNIX* system 
calls and supplies I/O routines for the EVA- 
960 Software Execution Vehicle. 

iC-960 also includes the following 
enhancements for embedded application 
development: 

• Programs may be easily placed in ROM.

• Memory-mapped I/O allows high-level
language access to application-specific input
and output.

• In-line assembly simplifies the integration of
C language and assembly code for speed-critical
functions.

• Floating point support produces in-line code
to take full advantage of the floating point
capability of the 80960SB, 80960KB and
80960MC.

Symbolic debugging of source code for iC-960 
and ASM-960 is provided by the DB-960 Source 
Level Debugger, the DBSIM960KA debugging 
simulator, the DB960CADIC in-target 
debugger, and the ICE960SB and ICE960KB 
emulators. 



DEBUGGING SIMULATOR 

The DBSIM960KA simulator features an easy 
to use, pulldown menu user interface combined 
with an 80960SA/80960KA instruction 
simulator. DBSIM960KA facilitates debugging 
80960SA and 80960KA applications by 
providing debugging capabilities before target 
hardware is available. DBSIM960KA's 
powerful, windowed, source-oriented interface 
allows you to focus your efforts on finding bugs 
rather than on learning and manipulating the 
debug environment. 

Ease of learning. Drop-down menus make the 
debugger easy to learn for new or casual users. 
A command line interface allows direct 
command entry for solving more complex 
problems, improving productivity of 
knowledgeable users. 

Extensive debug modes. You can set
conditional breakpoints, pass points, and
temporary breakpoints as needed.

See into your program. Using pull-down
menus or function keys, you can browse source
and call stacks, monitor processor registers,
view screen output, and watch the values of
variables change.

Full debug symbolics for maximum 
productivity. You need not know whether a 
variable is an unsigned integer, a real, or a 
structure: the debugger displays program 
variables in their respective type formats. 
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EVA-960KB4MB SOFTWARE 
EXECUTION VEHICLE 

The EVA-960KB4MB is a software execution 
vehicle for the 80960KA/KB microprocessor. It 
is a single PC AT plug-in board which provides 
easy and convenient architecture evaluation 
and benchmarking, as well as software 
development. Since the board uses an 
80960KB, 80960SA and 80960SB performance 
can be extrapolated. The EVA-960KB4MB 
contains the following: 

• 4 MB or 16 MB (EVA960KB16MB) of one 
wait-state program memory (DRAM) 

• 64 Kbytes of zero wait-state program 
memory (SRAM) 

• Three-channel programmable interval timer



• Hosted debug monitor which supports two 
hardware and 64 software breakpoints, 
single-step program execution, register and 
memory access, program download and 
upload 

• DOS access libraries that allow: screen 
display, keyboard input, read and write disk 
files, and the ability to spawn a DOS process 
that could communicate with serial or 
parallel I/O 

• 20 MHz operation, allowing software to 
operate at full speed of 80960KB 

EVA-960KB4MB also operates with the DB-960
Source Level Debugger for code
development/debug prior to target system
availability.



SOURCE-LEVEL DEBUGGER 

The DB-960 Debugger with source-level debug
capabilities is available for PC ATs equipped
with DOS. DB-960 can debug 80960 code
executing on an Intel EVA-960 Software 
Execution Vehicle or on a hardware target 
system via a serial interface. The EVA-960 
targeted debugger uses I/O resources provided 
by the PC, while 80960 code executes at high 
speed on the EVA-960. Two serial versions of 
DB-960 are available. DB-960CADIC plugs 
directly into the 80960CA socket on your 
prototype, offering a "plug-in and go" debug 
environment. DB-960D is a serial, retargetable 
version of DB-960 whose system debug monitor 
can be customized for 80960SA/SB, 80960KA/ 
KB, or 80960CA operation. 

Ease of learning. Drop-down menus make the 
debugger easy to learn for new or casual users. 
A command line interface allows direct 
command entry for solving more complex 
problems, improving productivity of 
knowledgeable users. 



Extensive debug modes. You can set
conditional breakpoints, pass points, and
temporary breakpoints as needed.

See into your program. Using pull-down
menus or function keys, you can browse source
and call stacks, monitor processor registers,
view screen output, and watch the values of
variables change.

Full debug symbolics for maximum 
productivity. You need not know whether a 
variable is an unsigned integer, a real, or a 
structure: the debugger displays program 
variables in their respective type formats. 

In-Target Debug. Porting the DB960D 
retargetable monitor to your target system 
allows the debugger to be used in-target, thus 
facilitating debugging of code dependent upon 
hardware interaction. 
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ICE960SB IN-CIRCUIT 
EMULATOR 

ICE960SB is a full featured in-circuit emulator 
for the 80960SA and 80960SB components. A 
separate ICE probe can be purchased to 
support 80960KA and 80960KB components. 
ICE960SB includes: 

• Full speed emulation of the 80960SA/SB
components to 16 MHz

• Complete symbolic information when used
with Intel 80960 compilers

• 1024 Frames Bus or Execution Trace with
Time-Tags

• Comprehensive break capabilities including
execution addresses, instruction type, bus
read/write/access, data values, and external
sync lines

• Qualification of break conditions based on an
8-state machine or an occurrence counter

• Fastbreaks to dynamically access memory or
variables during emulation

• Examine and modify memory and 80960
registers

• Stand-Alone Self-Test module provides
diagnostic circuitry and 256 Kbytes of
memory for software development

• Optional 2 Mbyte of relocatable expansion
memory

• Support for socketed and surface mounted 84
Pin PLCC components and surface mounted
80 Pin EIAJ components via ONCE mode

• DOS Hosting with support for RS232 and
RS422 communication links



WORLDWIDE SERVICE, 
SUPPORT, AND TRAINING 

To augment its development tools, Intel offers 
a full array of seminars, classes, and 
workshops, field application engineering 
expertise, hotline technical support, and on- 
site service. 

Intel also offers a Software Support package
which includes technical software information,
telephone support, automatic distribution of
software and documentation updates, access to
the "ToolTalk" electronic bulletin board,
"iComments" publication, remote diagnostic
software, and a development tools
troubleshooting guide.

Intel's Hardware Support package includes 
technical hardware information, telephone 
support, warranty on parts, labor, material, 
and on-site hardware support. 
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80960SA/SB DEVELOPMENT 
TOOLS 



ASM960          Assembler package containing the assembler,
                linker/loader, macro preprocessor, archiver,
                ROM image builder, other object file utilities,
                and the 80960SA/KA/CA floating point arithmetic
                library.

C960            Optimizing C Compiler, with ANSI extensions for
                embedded control applications; contains standard
                STDIO libraries and in-line assembly capability.

GEN960          80960 System Generation software automates the
                compilation, assembly and linking process.
                Simplifies usage of 80960 sophisticated features.

DBSIM960KA      Debugging Simulator software emulates the 80960SA
                and 80960KA instruction set allowing code
                development and debugging prior to hardware
                prototype availability.

DB960KBDEVA     Source Level Debugger software for the 80960KB/KA
                with powerful debug capabilities including
                conditional breakpoints, source and call stack
                browsing, memory/register display and
                modification, and ability to watch variables
                change value. Requires EVA-960KB4MB Software
                Execution Vehicle. For PC AT hosted systems only.

DB960D          Source Level Debugger software for 80960SA/SB,
                80960KA/KB, or CA processors resident on
                serially-interfaced hardware prototype systems.
                Includes customizable system debug monitor and
                serial interface protocol specifications. For
                PC AT hosted systems only.

EVA960KB4MB     Software Execution Vehicle for 80960SA/SB and
                80960KA/KB components. Includes 4 Mbyte of
                on-board memory, system debug monitor and code
                download software. Code compatible with the
                80960SA/SB components. Required by DB960KBDEVA.

EVA960KB16MB    Identical to EVA960KB4MB with 16 Mbyte of DRAM
                instead of 4 Mbyte.

ICE960SB        In-Circuit emulator for the 80960SA/SB components.
                Includes ICE base and probe, stand-alone self-test
                module, and your choice of PLCC or PQFP target
                adapters. Optional 2 Mbyte relocatable expansion
                memory option provides overlayable memory for
                software prototyping and hardware debugging.
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ARCHITECTURE EVALUATION 
STARTER KITS 

960SKit3 Contains ASM960D Assembler 
and iC960D Compiler 

DB960KIT2 Kit contains DB-960KBDEVA 
(KB version of DB-960 used with 
EVA-960), EVA960KB4MB 
Software Execution Vehicle, 
ASM960D and C960E. Requires 
PC AT with 640K memory. 



DB960KIT3 Kit contains DB-960D (serial
version of DB-960 supporting the
80960SA/SB, 80960KA/KB and
80960CA components, operating
on PC-AT/DOS), ASM960D and
C960D. Requires PC AT with
640K memory.



Product Code to order, by Host

Product      PC-AT/DOS    UNIX-386     OS/2     Sun 3/     HP9000/    VAX/      µVAX/
Category                  V.4                   UNIX       HP-UX      ULTRIX    ULTRIX

Assembler    ASM960D      ASM960S      ASM960P  ASM960U    ASM960H    ASM960VX  ASM960MX
C Compiler   C960D        C960S        C960P    C960U      C960H      C960VX    C960VX
System Gen   GEN960D      -            -        GEN960U    GEN960H    -         -
SX Debugger  DB960D       -            -        -          -          -         -
KX Debugger  DB960KBDEVA  -            -        -          -          -         -
             DB960D       -            -        -          -          -         -
CA Debugger  DB960CADIC   -            -        -          -          -         -
             DB960D       -            -        -          -          -         -
SA Simulator -            DBSIM960KAS  -        -          -          -         -
CA Simulator SIM960CAD    -            -        SIM960CAU  SIM960CAH  -         -
ICE960SB     ICE960SB     -            -        -          -          -         -
ICE960KB     ICE960KB     -            -        -          -          -         -
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INTERCHANGEABLE PROBES 

The ICE™-960 in-circuit emulator delivers real-time hardware and software debugging
capabilities for i960™ SA/SB and i960 KA/KB-based designs. Features include full-speed
emulation of each of the microprocessors, powerful breakpoint specification,
fastbreaks, optional relocatable expansion memory, two types of trace capability, large
trace buffering, sophisticated human interface and high-speed communication links with
the DOS host. The ICE-960 in-circuit emulator gives you unmatched control over all
phases of hardware/software debug, including developing, integrating and testing, which
improves development productivity and shortens time to market.



FEATURES 

• Real-Time Emulation of the i960 KA/KB 
microprocessors up to 25 MHz and 
emulation of the i960 SA/SB to 16 MHz 

• Full symbolic integration with Intel
ASM and C compilers

• Optional ICE960KBREM/ 
ICE960SBREM boards provide 2 Mbytes 
of ICE memory which can overlay user 
ROM or RAM. 

• Examine and modify memory and the 
i960 registers 



• Dynamically monitor and update 
program variables via fastbreaks 

• Breakpoint capabilities include:
execution address, instruction type, bus
read/write/access, and data value.
Qualification of events is based on an
occurrence counter and an 8-state state
machine
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• Hosted on IBM PC AT* or compatible and 
supporting RS232, RS422 and Ethernet 
operation 

• 1024 frame trace buffer for execution and/or 
bus trace and time tags 

• The on-chip cache does not affect collection of
the execution trace

• 256 Kbytes of memory in standalone self-test
(SAST) unit

• Real-time bus trace with time-tags for 
tracking code execution time 

• Assembly and disassembly of code in i960 
instruction mnemonics 

• ICE to component interconnect includes
support for surface-mounted and socketed 84-pin
PLCC and surface-mounted 80-pin
EIAJ QFP i960 SA/SB and 132-pin PGA for
i960 KA/KB

The ICE-960 in-circuit emulator provides 
emulation of the i960 SA/SB at speeds to 
16 MHz and the i960 KA/KB at speeds to 
25 MHz, thus providing early detection of 
subtle timing problems that may arise at full 
speed. Intel's intimate knowledge of the 
component makes possible the tightest 
conceivable conformance between timing 
parameters of the emulator and the target 
microprocessor. 

PROCESSOR/MEMORY 
EXAMINATION AND
MODIFICATION 

The i960 registers can be accessed
mnemonically (e.g. g12, r5, fp3) with the ICE-960
emulator software. Data can be displayed
or modified in hexadecimal, decimal, octal, or
binary and by data type (byte, word, etc.).
Program memory contents can be modified as 
i960 assembly instruction mnemonics. 

PROGRAM TRACING 

The ICE-960 emulator can store 1024 frames of
program execution history or processor
address/data bus activity in the trace buffer. Each
frame of program execution contains a 
discontinuity address (branch, call, return, etc) 
and a time-tag. This information can be used to 
reconstruct a history of the program execution. 
With the execution trace option enabled, the 
ICE-960 will run at less than full speed. Each 
trace frame of bus cycles contains one complete 
bus burst trace. Collection of trace information 
is controlled by a logic analyzer type moving 
trace window and by bus access type. 
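
The execution trace described above can be pictured as a fixed-depth buffer of (discontinuity address, time-tag) frames. The following is a minimal sketch, not the emulator's implementation: it assumes that once the buffer is full the oldest frame is discarded, and the class and method names are hypothetical.

```python
from collections import deque

TRACE_DEPTH = 1024  # number of frames, as stated in the text above


class TraceBuffer:
    """Fixed-depth store of (discontinuity_address, time_tag) frames."""

    def __init__(self, depth=TRACE_DEPTH):
        # Assumption: when full, the oldest frame is dropped first.
        self._frames = deque(maxlen=depth)

    def record(self, discontinuity_address, time_tag):
        """Append one execution frame (branch, call, return, ...)."""
        self._frames.append((discontinuity_address, time_tag))

    def is_full(self):
        return len(self._frames) == self._frames.maxlen

    def elapsed_between(self, first, last):
        """Time-tag difference between two stored frames, by index."""
        return self._frames[last][1] - self._frames[first][1]

    def history(self):
        """Frames in the order they were recorded, oldest first."""
        return list(self._frames)
```

Subtracting the time-tags of two frames, as `elapsed_between` does, is one way the recorded history could be used to track code execution time.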



EVENT RECOGNITION 
(BREAKPOINT CONTROL) AND 
EMULATION CONTROL 

ICE-960 provides comprehensive event 
recognition capabilities including: two 
hardware and thirty-two software breakpoints 
for instruction execution breakpoints, and use 
of the internal debug registers to recognize 
execution of certain instruction types such as 
branch or call instructions. Bus analysis logic 
provides recognition of external bus addresses 
qualified by read, write, or access type as well 
as data values. The data values may be entered 
as masked values and qualified by type. Two 
synchronization lines are provided for 
recognition of external events. ICE-960 also
provides qualification of events based on an
occurrence counter or by a recognition
sequence of up to 8 events. Additionally,
emulation can be automatically stopped when
the trace buffer is full. Besides the ability to
execute program code at full speed between 
specified points, the ICE-960 emulator provides 
the capability to single-step through program 
code. 
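
The two qualification schemes mentioned above, an occurrence counter and a recognition sequence of up to 8 events, can be modelled in a few lines. This is a sketch under stated assumptions rather than a description of the ICE-960 hardware: events are represented as plain Python values, matching is done by caller-supplied predicates, and all class names are hypothetical.

```python
class OccurrenceCounter:
    """Signal a break on the Nth event that satisfies a predicate."""

    def __init__(self, matches, count):
        if count < 1:
            raise ValueError("count must be at least 1")
        self._matches = matches
        self._remaining = count

    def observe(self, event):
        """Feed one event; return True once the Nth match has been seen."""
        if self._remaining > 0 and self._matches(event):
            self._remaining -= 1
        return self._remaining == 0


class EventSequence:
    """Signal a break after an ordered sequence of events is recognized."""

    MAX_EVENTS = 8  # sequence length limit stated in the text above

    def __init__(self, steps):
        if not 1 <= len(steps) <= self.MAX_EVENTS:
            raise ValueError("a sequence needs between 1 and 8 events")
        self._steps = list(steps)
        self._state = 0  # index of the next step waiting to be matched

    def observe(self, event):
        """Advance one state when the current step matches the event."""
        if self._state < len(self._steps) and self._steps[self._state](event):
            self._state += 1
        return self._state == len(self._steps)
```

For example, `EventSequence([is_call, is_write])` with two hypothetical predicates would report a break only when a write is seen after a call, ignoring writes that occur earlier.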

RELOCATABLE EXPANSION 
MEMORY 

An optional board provides ICE-960 with 2 
Mbytes of relocatable expansion memory 
which allows users to develop applications 
either before the target system memory is 
working, or in place of ROM or EPROM to 
speed the debugging cycle. This memory can be 
mapped in two separate 1 Mbyte partitions on 
1 Mbyte boundaries. 
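
The mapping rule above (two separate 1 Mbyte partitions, each on a 1 Mbyte boundary) can be checked mechanically. The helper below is a hypothetical sketch for planning a memory map; it is not an ICE-960 command, and its name and return format are assumptions.

```python
MBYTE = 1 << 20          # 1 Mbyte partition size and alignment
MAX_PARTITIONS = 2       # 2 Mbytes total, mapped as two partitions


def validate_partitions(base_addresses):
    """Return (start, end) address pairs for each requested partition.

    Raises ValueError if more than two partitions are requested, if a
    base address is not on a 1 Mbyte boundary, or if a base is repeated.
    """
    if len(base_addresses) > MAX_PARTITIONS:
        raise ValueError("only two 1 Mbyte partitions are available")
    if len(set(base_addresses)) != len(base_addresses):
        raise ValueError("partitions must not overlap")
    ranges = []
    for base in base_addresses:
        if base % MBYTE != 0:
            raise ValueError(f"{base:#010x} is not on a 1 Mbyte boundary")
        ranges.append((base, base + MBYTE - 1))
    return ranges
```

Because both partitions have the same size and alignment, two distinct aligned bases can never overlap, so a duplicate check is sufficient here.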

For the new ICE960KBREM board, the
memory wait-state pattern is (3,1,1,1) when the
user's system does not return RDY# for
accesses in the mapped area. For accesses
where the user's system does return RDY# for
these areas, the wait-state pattern will be the
larger of (3,1,1,1) or the user wait-state pattern plus
(2,2,2,2). For either board, the size and shape of
the board is identical to the ICE probe and is 
installed between the probe and the user's 
target system when in use. The memory 
configuration can be mapped via an ICE MAP 
command. 
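
The wait-state rule for the ICE960KBREM board can be worked through numerically. The sketch below assumes that "the larger of" is applied to each of the four burst accesses independently, which the text does not state explicitly; the function name is hypothetical.

```python
REM_PATTERN = (3, 1, 1, 1)    # pattern when the user system does not return RDY#
REM_OVERHEAD = (2, 2, 2, 2)   # added to the user pattern when RDY# is returned


def effective_wait_states(user_pattern=None):
    """Return the 4-access wait-state pattern seen in the mapped area.

    user_pattern is None when the user system does not return RDY#.
    Assumption: the comparison is made access by access.
    """
    if user_pattern is None:
        return REM_PATTERN
    if len(user_pattern) != 4:
        raise ValueError("a wait-state pattern has four entries")
    padded = tuple(u + o for u, o in zip(user_pattern, REM_OVERHEAD))
    return tuple(max(r, p) for r, p in zip(REM_PATTERN, padded))
```

Under this reading, a user system that returns RDY# with a (0,0,0,0) pattern would see (3,2,2,2), and one with (2,1,1,1) would see (4,3,3,3).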

The ICE960KBREM/ICE960SBREM cards add
some constraints when used with the ICE in a
user's target system. First, users should qualify
bus drivers/buffers with DEN# in order to
eliminate potential bus conflict between the
REM board and their target memory while
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using the ICE. Second, the 1M Byte partition 
size cannot be reduced and may affect the
design of the user's memory subsystem. Third,
the REM boards delay the ADS# and DEN#
signals by 5 ns (typical) and delay the RDY#
signal by 4 ns (typical). Fourth, they add loading,
capacitance, and power requirements as shown
in Tables 3 and 4.

STANDALONE OPERATION 

Product software can be developed and 
debugged prior to and independent of 
hardware availability with the Standalone Self 
Test unit (SAST), which contains 256 Kbytes of 
two wait-state program memory. The SAST 
also provides diagnostic testing to assure full 
functionality of the ICE-960 emulator. 

VERSATILE AND POWERFUL 
HOST SOFTWARE 

ICE-960 provides an easy-to-use human 
interface which utilizes color forms to 
complement a powerful command set. The 
software includes: an on-line help facility, a 
dynamic command entry and syntax guide, 
screen oriented editor, assembler and 
disassembler, input/output redirection,
command piping, DOS command entry, and 
the ability to customize the command set via 
debug procedures and literal definitions. 

DEBUG PROCEDURES AND 
LITERALS 

Debug procedures (PROCs) are user-defined 
groups of ICE-960 emulator commands. They
can be stored on disk and recalled during later 
debugging sessions. PROCs can be used to 
simplify the process of debugging by grouping 
repetitive emulator commands, which can then 
be accessed by typing the name of the PROC. 
Literals are user-defined abbreviations for 
whole or partial ICE-960 emulator commands. 
Literals are a shorthand method of 
customizing the emulator commands to fit 
your needs and preferences. 
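
The idea of a literal, a user-defined abbreviation that stands for a whole or partial command, can be illustrated with simple token substitution. This sketch is not the ICE-960 command interpreter: the expansion rule (whole whitespace-separated tokens only) is an assumption, and the command names in the usage note are invented for illustration.

```python
def expand_literals(command_line, literals):
    """Replace each token that names a literal with its definition.

    literals maps an abbreviation to the text it stands for. Tokens
    that are not defined as literals pass through unchanged.
    """
    return " ".join(literals.get(token, token) for token in command_line.split())
```

With a hypothetical table such as `{"show": "display register"}`, the input `show r5` would expand to `display register r5`.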



ICE TO COMPONENT 
INTERCONNECT SYSTEM 

Using the On-Circuit Emulation (ONCE) i960 
SA/SB silicon feature, ICE960SB can be used 
in systems with surface-mounted i960 SA/SB 
components in either PLCC or EIAJ QFP 
packages. The hinge cable adapters included in 
the various ICE kits and pictured to the right, 
are placed directly on top of the surface 
mounted i960 SA/SB device. The circuitry 
necessary for the emulator to take control 
from the target processor is fully supported in 
the emulator. No additional circuitry is 
required. 

Socketed i960 SA/SB components in PLCC 
packages and i960 KA/KB components in PGA 
packages are also supported. See Figures 1, 
2, 3, and 4 for 
ICE Probe physical characteristics. Refer to 
Table 5 for hinge cable loading and delay 
characteristics. 

WORLDWIDE SERVICE, 
SUPPORT, AND TRAINING 

To augment its development tools, Intel offers 
a full array of seminars, classes, workshops, 
field application engineering expertise, hotline 
technical support, and on-site service. 

Intel also offers a Software Support contract 
which includes technical software information, 
automatic distribution of software and 
documentation updates, the iComments 
publication, remote diagnostic software, and a 
development tools troubleshooting guide. 
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HIGH-SPEED HOST-TO-ICE 

COMMUNICATIONS 

PROTOCOLS 

ICE-960 supports RS232 and RS422 
communications protocols to 115.2 KBaud and 
1.152 MBaud respectively, depending upon the 
ability of the host to support the specific rate. 
Testing for these systems and the 
configurations involved are described in the 
following sections. 
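
To put the two rates in perspective, the sketch below estimates the ideal time to move a 256 Kbyte image (the size of the SAST program memory) over each link. It assumes 10 bits on the wire per byte (one start bit, eight data bits, one stop bit) and no protocol overhead; neither assumption comes from this data sheet, so real download times will be longer.

```python
def transfer_seconds(num_bytes: int, baud: float, bits_per_byte: int = 10) -> float:
    """Ideal time to send num_bytes over an asynchronous serial link."""
    return num_bytes * bits_per_byte / baud


IMAGE_BYTES = 256 * 1024  # 256 Kbytes

for name, baud in (("RS232", 115_200), ("RS422", 1_152_000)):
    print(f"{name} at {baud} baud: {transfer_seconds(IMAGE_BYTES, baud):.2f} s")
```

Under these assumptions the RS422 link is exactly ten times faster, since only the baud rate changes.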



Intel's 90-day Hardware Support package 
includes technical hardware information, 
warranty on parts, labor, material, and on-site 
hardware support. 

Intel Development Tools also offers a 30-day, 
money-back guarantee to customers who are 
not satisfied after purchasing any Intel 
development tool. 



SPECIFICATIONS 



HOST REQUIREMENTS 

IBM PC- AT (minimum requirements) with 640 
KBytes of conventional memory 

1 MByte of RAM (Lotus, Intel, Microsoft 
expanded memory specification) 

20 MByte Fixed Disk 

At least one 5¼" or 3½" Floppy Disk drive 

RS232 or RS422 Communication Interface 

DOS Operating System (version 3.2 or 3.3) 

TESTED HOST 
CONFIGURATIONS 

IBM PC- AT with DOS 3.3. Tested with built- 
in RS232 and a Quatech DS202 Asynchronous 
RS422 Communications Board with 16550 
Option 

MECHANICAL SPECIFICATIONS 



COMPAQ Deskpro 386* with DOS 3.3. 

Tested with built-in RS232 and Quatech DS202 
Asynchronous RS422 Communications Board 
with 16550 Option 

Systems based on an Intel 301/302™ Box 
with DOS 3.3. Tested with built-in RS232 to 
115.2 KBaud and a Quatech DS202 
Asynchronous RS422 Communications Board 
with 16550 Option to 1.152 MBaud 

IBM Personal System/2* with DOS 4.01. 

Tested with built-in RS232 

REQUIRED SYSTEM 
RESOURCES 

The ICE-960 emulator requires the following: 
a) exclusive use of the i960 SA/SB or i960 KA/ 
KB's on-chip debug registers and b) a 
minimum of 256 bytes of target system RAM 
used to flush the i960 local registers. 



TABLE 1. ICE-960 Emulator Physical Characteristics 



                     Width           Height          Length           Weight
Unit                 Inches  cm      Inches  cm      Inches  cm       lbs   kg
Control Unit         10.5    26.7    1.5     3.8     16.0    40.6     6.0   2.72
Processor Module*    3.8     9.6     1.5     3.8     5.0     12.7
SAST                 6.0     15.2    2.0     5.1     8.0     20.3     3.5   1.59
OIB                  3.8     9.6     0.9     2.3     5.1     13.0
Power Supply         2.8     7.1     4.2     10.7    11.0    27.9     4.7   2.14
User Cable                                           22.0    55.9
Serial Cables                                        12.0'   3.66 m

* Measurement includes target adaptor 
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Figure 1: ICE960KB25 Processor Module 
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Figure 2: Optional Isolation Board 
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PLCC Hinge Cable Dimensions 

(Drawing: underside of PGA socket, PLCC footprint, and side views, 
showing the required clearance for surface-mount components. 
All measurements in centimeters.) 

Figure 3: ICE960SB16C Adapter 
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ELECTRICAL SPECIFICATIONS 

SYNC Line Specification 

The SYNCIN line must be valid for at least one 
instruction cycle because it is only sampled on 
bus access boundaries. The SYNCIN line is a 
standard TTL input. The SYNCOUT line is 
driven by a TTL open collector with a 4.75 KΩ 
pull-up resistor. 



AC /DC Specifications 

The Optional Isolation Board (OIB) isolates the 
ICE-960 probe from an untested user target 
system. When the OIB is in use, the ICE-960 
AC and DC specifications differ from the i960 
microprocessor as shown below. When the OIB 
is not installed, the ICE-960KB timing 
specifications are identical to those of the i960 
component. 



TABLE 2. AC Specifications with the OIB Installed 



                                               16 MHz 80960SB        25 MHz 80960KB
Symbol*  Parameter                             Min        Max        Min        Max
T1       Clock Period                          32 ns      125 ns     20 ns      125 ns
T2       Clock Low Time                        9 ns                  6 ns
T3       Clock High Time                       9 ns                  6 ns
T4       Clock Fall Time                                  10 ns                 10 ns
T5       Clock Rise Time                                  10 ns                 10 ns
T6       Output Valid Delay
           A(2:3), BE#(0:1), BLAST#,*
           DEN#, DTR#, WR#**                              40 ns                 33 ns
           A/D Lines***                                   40 ns                 33 ns
T6AS     AS Valid Delay (AS#)                             36 ns                 33 ns
T7       ALE# Width                            16 ns                 12 ns
T8       ALE# Valid Delay                                 36 ns                 33 ns
T9       Output Float Delay
           A(2:3), BE#(0:1), BLAST#,*
           DEN#, DTR#, WR#**                              50 ns                 35 ns
           A/D Lines                                      50 ns                 40 ns
T10      Input Setup 1
           HLDA, INT0#, INT1, INT2, INT3#      13 ns                 6 ns
T11      Input Hold
           HLDA, INT0#, INT1, INT2, INT3#      10 ns                 13 ns
           HOLD, READY#, LOCK#                 10 ns                 13 ns
T12      Input Setup 2
           HOLD, READY#, LOCK#                 17 ns                 11 ns
T13      Setup to ALE# Inactive                7 ns                  7 ns
T14      Hold after ALE#                       5 ns                  5 ns
T15      RESET# Hold                           4 ns                  4 ns
T16      RESET# Setup                          4 ns                  4 ns
T17      RESET# Width                          1281 ns               820 ns



*Tplh dependent on termination for KB control signals 

* OIB does not float A/D bus during Tr and Ti (between bus cycles) 

♦Output Valid Delay for control signals after HOLD ACKNOWLEDGE is deasserted 50 ns for 80960SB and 43 ns for 80960KB 
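
The clock rows of Table 2 (T1 through T3) lend themselves to a quick design check. The sketch below encodes only those three limits as they appear in the table and reports which ones a proposed target clock would violate. The dictionary keys and the function name `clock_violations` are illustrative choices, and the measured values passed in are made-up examples.

```python
# Clock limits with the OIB installed, in nanoseconds, copied from Table 2.
OIB_CLOCK_LIMITS_NS = {
    "80960SB-16": {"period_min": 32.0, "period_max": 125.0, "low_min": 9.0, "high_min": 9.0},
    "80960KB-25": {"period_min": 20.0, "period_max": 125.0, "low_min": 6.0, "high_min": 6.0},
}


def clock_violations(part: str, period_ns: float, low_ns: float, high_ns: float) -> list:
    """Return the Table 2 symbols (T1, T2, T3) that the given clock violates."""
    limits = OIB_CLOCK_LIMITS_NS[part]
    problems = []
    if not limits["period_min"] <= period_ns <= limits["period_max"]:
        problems.append("T1")
    if low_ns < limits["low_min"]:
        problems.append("T2")
    if high_ns < limits["high_min"]:
        problems.append("T3")
    return problems


print(clock_violations("80960KB-25", period_ns=20.0, low_ns=10.0, high_ns=10.0))
print(clock_violations("80960SB-16", period_ns=25.0, low_ns=8.0, high_ns=12.0))
```

An empty list means the clock meets all three limits; rise and fall times (T4, T5) would need a separate check.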
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TABLE 3. ICE-960 Emulator DC Specifications 





              ICE Probe    OIB    REM    Processor Speed
ICE960SB      1.4          0.4    0.5    16
ICE960KB      1.4          0.6    0.7    25



TARGET SYSTEM DESIGN 
CONSIDERATIONS 

In addition to the mechanical, power 
consumption, and signal loading 
considerations for the ICE probe, the following 
points should be taken into account when the 
target system is being designed: 

1) [SA/SB/KA/KB/MC] 

The AD bus should not be driven by an 
external source unless DEN# is asserted. 

2) [SA/SB/KA/KB/MC] 

The LOCK# signal must be terminated as 
recommended in the 80960SA/SB 
component data sheet. 

3) [SA/SB/KA/KB/MC] 

To guarantee timings, the ICE requires 
±5% supply voltage to the target system 
(i.e., ICE probe power). 

4) [SA/SB] 

To ensure correct bus trace, the ICE requires 
a data hold time (T11) of 4 ns. 

5) [SA/SB/KA/KB/MC] 

Each Vcc and GND pin of the processor 
must be connected to the appropriate 
voltage or ground and externally strapped 
close to the package. 

6) [SA/SB/KA/KB/MC] 

Processor no connect (N.C.) pins must be 
left disconnected. 
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TABLE 4. Additional DC Loading 



              (ICE Probe)         (OIB)               (KB REM)            (SB REM)
Signal        IIH Max   IIL Max   IIH Max   IIL Max   IIH Max   IIL Max   IIH Max   IIL Max
AD(0:31)      25 µA     25 µA     15 µA     -15 µA    120 µA    0.7 mA    20 µA     100 µA
ADS#          25 µA     25 µA     115 µA    -15 µA    Driven by 74AS760   10 µA     10 µA
DEN#          25 µA     25 µA     115 µA    -15 µA    w/ 4.7k Pull-Up
W/R#          25 µA     25 µA     115 µA    -15 µA    150 µA    1.7 mA    10 µA     10 µA
CLK2          50 µA     500 µA    25 µA     -25 µA    130 µA    2.9 mA    20 µA     1600 µA
RESET         25 µA     250 µA    45 µA     -750 µA   250 µA    0.3 mA    10 µA     10 µA
BE(0:3)#      25 µA     25 µA     115 µA    -15 µA    10 µA     0.1 mA    10 µA     10 µA
READY#        25 µA     25 µA     45 µA     -750 µA   750 µA    0.8 mA    25 µA     260 µA
ALE#          25 µA     25 µA     15 µA     -15 µA    20 µA     0.5 mA    10 µA     1600 µA
DT/R#         25 µA     25 µA     115 µA    -15 µA
INT(0:3)      25 µA     25 µA     15 µA     -565 µA
BADAC#        25 µA     25 µA     15 µA     -565 µA
LOCK#         25 µA     25 µA     140 µA    -500 µA
HOLD          25 µA     25 µA     45 µA     -750 µA
FAILURE#      25 µA     25 µA     20 µA     -1 mA











TABLE 5. 80960SB PLCC Hinge Cable Loading and Delay 




Signal Loading    15 pF typical 

Signal Delay      Signals from processor delayed 4 ns typical; 
                  setup and hold timings unaffected. 
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Order Code 

ADPT80EIAJ     Hinge Cable Adapter for 
               surface-mount i960SB EIAJ 
               QFP packages. This adapter is 
               included in the ICE960SB16J 
               kit. 

ADPT84PLCC     Hinge Cable Adapter for 
               surface-mount and socketed 
               i960SB PLCC packages. This 
               adapter is included in the 
               ICE960SB16C kit. 

ICE960SB16C    ICE960 base, i960 SA/SB 
               probe, 84-pin PLCC surface- 
               mount and socketed target 
               component interconnect, and 
               RS232 and RS422 
               communication cables. 
               (Shrink-Wrap license, Class 1) 

ICE960SB16J    ICE960 base, i960 SA/SB 
               probe, 80-pin EIAJ surface- 
               mount target component 
               interconnect, and RS232 and 
               RS422 communication cables. 
               (Shrink-Wrap license, Class 1) 

ICE960KB25     ICE960 base, i960 KA/KB 
               probe, 132-pin PGA target 
               component interconnect, and 
               RS232 and RS422 
               communication cables. 
               (Shrink-Wrap license, Class 1) 



Order Code Description 

ICE960SBREM Optional 2 MByte Relocatable 
Expansion Memory Board for 
i960 SA/SB components. 

ICE960KBREM Optional 2 MByte Relocatable 
Expansion Memory Board for 
80960KA/KB components. 

PTOI960SB16 Probe and Software to convert 
ICE960KB25 to ICE960SB16. 
An ADPT80EIAJ or 
ADPT84PLCC adapter kit 
should also be ordered with 
this package to support the 
component packaging type of 
your choice. (Shrink-Wrap 
license, Class 1) 

PTOI960KB25 Probe and Software to convert 
ICE960SB16C or 
ICE960SB16J to ICE960KB25. 
(Shrink-Wrap license, Class 1) 
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IN-CIRCUIT EMULATOR FOR THE 80960MC 
MICROPROCESSOR 

The ICE™-960MC In-circuit Emulator delivers real-time hardware and software 
debugging capabilities for 80960MC based designs. Features include emulation of the 
80960MC microprocessor, powerful breakpoint specification, fastbreaks, optional 
relocatable expansion memory, two types of trace capability, large trace buffering, 
support of virtual and physical component addressing modes, and sophisticated human 
interface. The ICE-960MC In-circuit Emulator gives you unmatched control over all 
phases of hardware/ software debug, including developing, integrating and testing, which 
improves development productivity and speeds time to market. 

FEATURES 






• Real-Time Emulation of the 80960MC 
microprocessors up to 20 MHz (25 MHz 
optional) 

• Full Symbolic Information Relating to 
Code. Data symbolics subject to some 
limitations in virtual addressing mode 

• Optional ICE960KBREM Board Provides 
2 Mbytes of ICE Memory Which Can 
Overlay User ROM or RAM. 

• Zero wait-state operation from user 
memory 

• Examine and modify Memory and the 
80960 Registers 



• Breakpoint Capabilities include: 
Execution Address, Instruction Type, 
Bus Read/Write/Access, and Data 
Value. Qualification of Events is Based 
on an Occurrence Counter and an 8 state 
State-Machine 

• Hosted on IBM PC AT or compatible 

• Dynamically monitor or update program 
variables or memory during emulation 
with Fastbreaks 

• 1024 Frame Trace Buffer for execution 
and/or Bus Trace and time tags 

• 256 Kbytes of Memory in Standalone 
Self-Test (SAST) Unit 
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ICETM-960MC IN-GIRCUIT EMULATOR 



REALTIME EMULATION 

The ICE-960MC In-circuit Emulator provides 
emulation of the 80960MC at speeds up to 20 
MHz (25 MHz optional), thus providing early 
detection of subtle timing problems. Intel's 
intimate knowledge of the component makes 
possible the tightest conceivable conformance 
between timing parameters of the emulator 
and the target microprocessor. 

PROCESSOR/MEMORY 
EXAMINATION AND 
MODIFICATION 

The 80960MC registers can be accessed 
mnemonically (e.g., g12, r5, fp3) with the 
ICE-960MC emulator software. Data can be 
displayed or modified in one of four bases 
(hexadecimal, decimal, octal, or binary) and by 
data type (byte, word, etc). Program memory 
contents can be disassembled and displayed as 
80960 assembly instruction mnemonics. 
Additionally, 80960 assembly instruction 
mnemonics can be assembled and stored into 
program memory. 80960MC system data 
structures such as the segment table, dispatch 
port, and page tables can also be accessed and 
modified mnemonically. 

PROGRAM TRACING 

The ICE-960MC emulator can store 1024 
frames of program execution history or 1024 
frames of the 80960MC address/data bus 
activity in the trace buffer. Each frame of 
program execution contains a discontinuity 
address (branch, call, return, etc) and a time- 
tag. This information can be used to 
reconstruct a history of the program execution. 
With the execution trace option enabled, the 
ICE-960MC will run at less than full speed. 
Each trace frame of bus cycles contains one 
complete bus burst trace. Collection of trace 
information is controlled by a logic analyzer 
type moving trace window and by bus access 
type. 
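
A fixed-depth trace buffer of this kind can be modelled as follows. This is a conceptual sketch, not the emulator's implementation: it assumes that in normal operation the oldest frame is discarded once 1024 frames are held, and that an optional mode signals the caller to stop emulation when the buffer fills. The class and method names are illustrative.

```python
from collections import deque


class TraceBuffer:
    """Model of a fixed-depth trace buffer holding (address, time_tag) frames."""

    def __init__(self, capacity: int = 1024, stop_when_full: bool = False):
        self.capacity = capacity
        self.stop_when_full = stop_when_full
        # In wrap-around mode the deque silently drops the oldest frame.
        self.frames = deque(maxlen=None if stop_when_full else capacity)

    def record(self, address: int, time_tag: int) -> bool:
        """Store one frame; return False when emulation should stop."""
        self.frames.append((address, time_tag))
        return not (self.stop_when_full and len(self.frames) >= self.capacity)


trace = TraceBuffer(capacity=4)
for i in range(6):
    trace.record(address=0x1000 + 16 * i, time_tag=i)
print(list(trace.frames))  # only the four most recent frames remain
```

Walking the stored discontinuity addresses in time-tag order is what allows the execution history to be reconstructed.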

EVENT RECOGNITION 
(BREAKPOINT CONTROL) AND 
EMULATION CONTROL 

ICE-960MC provides comprehensive event 
recognition capabilities including: two 
hardware and thirty-two software breakpoints 
for instruction execution breakpoints, and use 
of the internal debug registers to recognize 
execution of certain instruction types such as 



branch or call instructions. Bus analysis logic 
provides recognition of external bus addresses 
qualified by read, write, or access type as well 
as data values which may be entered as 
masked values. Two synchronization lines are 
provided for recognition of external events. 
ICE-960MC also provides qualification of 
events based on an occurrence counter or by a 
recognition sequence of up to 8 events. Special 
additions for the 80960MC include the ability 
to recognize process binds. Additionally, 
emulation can be automatically stopped when 
the trace buffer is full. Besides the ability to 
execute program code at full speed between 
specified points, the ICE-960MC emulator 
provides the capability to single-step through 
program code. 
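
The interplay of a recognition sequence and an occurrence counter can be illustrated with a small software model. It is a sketch under stated assumptions: events are plain strings, events that do not match the next expected step are ignored rather than resetting the sequence, and the counter counts completed sequences. How the emulator hardware actually combines the two is not specified here, and the event names are hypothetical.

```python
class EventSequencer:
    """Break after a sequence of events has been seen a given number of times."""

    MAX_EVENTS = 8  # the emulator recognizes sequences of up to 8 events

    def __init__(self, sequence, occurrences: int = 1):
        if not 1 <= len(sequence) <= self.MAX_EVENTS:
            raise ValueError("sequence must contain 1 to 8 events")
        if occurrences < 1:
            raise ValueError("occurrences must be at least 1")
        self.sequence = list(sequence)
        self.occurrences = occurrences
        self.position = 0  # index of the next expected event
        self.count = 0     # completed sequences so far

    def observe(self, event) -> bool:
        """Feed one recognized event; return True when emulation should break."""
        if event == self.sequence[self.position]:
            self.position += 1
            if self.position == len(self.sequence):
                self.position = 0
                self.count += 1
                if self.count >= self.occurrences:
                    return True
        return False


seq = EventSequencer(["write_A", "read_B"], occurrences=2)
for ev in ["write_A", "other", "read_B", "write_A", "read_B"]:
    if seq.observe(ev):
        print("break after", ev)
```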

RELOCATABLE EXPANSION 
MEMORY 

An optional board provides ICE-960MC with 2 
Mbytes of relocatable expansion memory 
which allows users to develop applications 
either before the target system memory is 
working, or in place of ROM or EPROM to 
speed the debugging cycle. This memory can be 
mapped in two separate 1 Mbyte partitions on 
1 Mbyte boundaries. The memory waitstate 
pattern is (3,1,1,1) when the user's system does 
not return RDY# for accesses directed to the 
ICE960KBREM board. For accesses where the 
user system does return RDY# the waitstate 
pattern will be the larger of (3,1,1,1) or user 
waitstate pattern plus (2,2,2,2). The size and 
shape of the board is identical to the ICE probe 
and is installed between the probe and the 
user's target system when in use. The memory 
configuration can be mapped via either an ICE 
MAP command or via switches on the 
ICE960KBREM board. 
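
The wait-state rule above can be worked through numerically. The sketch below assumes that "the larger of" is applied access by access across the four-word burst; the text does not say whether the comparison is per access or for the pattern as a whole, so treat this as one reading. The function name is illustrative.

```python
REM_DEFAULT = (3, 1, 1, 1)   # pattern when the target does not return RDY#
REM_PENALTY = (2, 2, 2, 2)   # added to the user's pattern when it does


def rem_wait_states(user_pattern=None):
    """Effective wait states for a four-word burst directed to the REM board."""
    if user_pattern is None:  # target system does not return RDY#
        return REM_DEFAULT
    padded = tuple(u + p for u, p in zip(user_pattern, REM_PENALTY))
    return tuple(max(d, p) for d, p in zip(REM_DEFAULT, padded))


print(rem_wait_states())               # no RDY# from the target
print(rem_wait_states((0, 0, 0, 0)))   # zero wait-state target memory
print(rem_wait_states((2, 1, 1, 1)))
```

Under this reading, a target that returns RDY# never sees fewer than two wait states on any access to the board.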

The ICE960KBREM card adds some 
constraints when used with the ICE in a user's 
target system. First, users should qualify bus 
drivers/buffers with DEN# in order to 
eliminate potential bus conflict between the 
REM board and their target memory. Second, the 
1 Mbyte partition size cannot be reduced and 
may affect the design of the user's memory 
subsystem. Third, ICE960KBREM delays the 
ADS# and DEN# signals by 5 ns (typical) 
and delays the RDY# signal by 2 ns (typical). 
Fourth, it adds loading, capacitance, and power 
requirements as shown in Tables 3 and 4. 
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STANDALONE OPERATION 

Product software can be developed and 
debugged prior to and independent of 
hardware availability with the Standalone Self 
Test unit (SAST), which contains 256 Kbytes of 
two wait-state program memory. The SAST 
also provides diagnostic testing to assure full 
functionality of the ICE-960MC emulator. 

VERSATILE AND POWERFUL 
HOST SOFTWARE 

ICE-960MC provides an easy-to-use human 
interface which utilizes color and pull-down 
menus to complement a powerful command 
set. The software includes: an on-line help 
facility, a dynamic command entry and syntax 
guide, screen oriented editor, assembler and 
disassembler, input/output redirection, 
command piping, DOS command entry, and 
the ability to customize the command set via 
debug procedures and literal definitions. 



Special software commands are provided to 
display, interpret, and modify the 80960MC 
hardware data structures including the 
segment table, dispatch port, process control 
block, and the page tables and directories. 

DEBUG PROCEDURES AND 
LITERALS 

Debug procedures (PROCs) are user-defined 
groups of ICE-960MC emulator commands. 
They can be stored on disk and recalled during 
later debugging sessions. PROCs can be used to 
simplify the process of debugging by grouping 
repetitive emulator commands, which can then 
be accessed by typing the name of the PROC. 
Literals are user-defined abbreviations for 
whole or partial ICE-960MC emulator 
commands. Literals are a shorthand method of 
customizing the emulator commands to fit 
your needs and preferences. 
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WORLDWIDE SERVICE, 
SUPPORT, AND TRAINING 

To augment its development tools, Intel offers 
a full array of seminars, classes, and 
workshops, field application engineering 
expertise, hotline technical support, and on- 
site service. 

Intel also offers a Software Support package 
which includes technical software information, 



telephone support, automatic distribution of 
software and documentation updates, access to 
the "ToolTalk" electronic bulletin board, 
"iComments" publication, remote diagnostic 
software, and a development tools 
troubleshooting guide. 

Intel's Hardware Support package includes 
technical hardware information, telephone 
support, warranty on parts, labor, material, 
and on-site hardware support. 



SPECIFICATIONS 



HOST REQUIREMENTS 

IBM PC AT (minimum requirements) with 640 
KB of conventional memory 

• 1 MB of RAM (Lotus, Intel, Microsoft 
expanded memory specification) 

• 20 MB Fixed Disk 

• At least one 5¼" Floppy Disk drive 

• A serial interface 

• DOS Operating system (version 3.2 or later 
excluding 4.x) 

Mechanical Specifications 



REQUIRED SYSTEM 
RESOURCES 

The ICE-960MC emulator requires the 
following: a) exclusive use of the 80960MC's on- 
chip debug registers and b) a minimum of 256 
bytes of target system RAM used to flush the 
80960 local registers. 



TABLE 1. ICE-960MC Emulator Physical Characteristics 




                     Width           Height          Length           Weight
Unit                 Inches  cm      Inches  cm      Inches  cm       lbs   kg
Control unit         10.5    26.7    1.5     3.8     16.0    40.6     6.0   2.72
Processor module*    3.8     9.6     1.5     3.8     5.0     12.7
SAST                 6.0     15.2    2.0     5.1     8.0     20.3     3.5   1.59
OIB                  3.8     9.6     0.9     2.3     5.1     13.0
Power supply         2.8     7.1     4.2     10.7    11.0    27.9     4.7   2.14
User cable                                           22.0    55.9
Serial cable                                         12.0 ft 3.66 m





* measurement includes target adaptor 
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SIDEVIEW 
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Figure 1: Processor Module 



TOPVIEW 
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OPTIONAL ISOLATION BOARD 
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2 PL 



Figure 2: Optional Isolation Board 
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ELECTRICAL SPECIFICATIONS AC/DC Specifications 

The Optional Isolation Board (OIB) isolates the 
ICE-960MC probe from an untested user target 
system. When the OIB is in use, the ICE-960MC 
AC and DC specifications differ from 
the 80960MC microprocessor as shown below. 
When the OIB is not installed, the ICE-960MC 
specifications are identical to those of the 
80960MC component. 

SYNC Line Specification 

The SYNCIN line must be valid for at least one 
instruction cycle because it is only sampled on 
instruction boundaries. The SYNCIN line is a 
standard TTL input. The SYNCOUT line is 
driven by a TTL open collector with a 4.75K- 
ohm pull-up resistor. 



TABLE 2. AC Specifications With The OIB Installed 


Symbol*  Parameter                              Minimum        Maximum
t2       clock low time                         t2 + 1 ns
t3       clock high time                        t3 + 1 ns
t6       output valid delay
           A/D 0:31                             t6 + 8 ns      t6 + 16 ns
           DT/R#, DEN#, BE0-3#, ADS#, W/R#      t6 + 7 ns      t6 + 14 ns
           HLDA, CACHE, LOCK#, INTA#            t6 + 6 ns      t6 + 8 ns
           ALE#                                 t6 + 10 ns     t6 + 20 ns
t7       ALE# width                             t7 - 6.5 ns
t8       ALE# disable delay                     t8 + ns        t8 + 14 ns
t9       output float delay
           A/D 0:31                             t9 + 5 ns      t9 + 22 ns
           DT/R#, DEN#, BE0-3#, ADS#, W/R#      t9 + 7 ns      t9 + 15 ns
           HLDA, CACHE, LOCK#, INTA#            t9 + 6 ns      t9 + 8 ns
t10      input setup 1
           A/D 0:31                             t10 + 2 ns
           BADAC#, INT0-3# deassertion          t10 + 14 ns
t11      input hold
           A/D 0:31, HOLD                       t11 + 6 ns
           BADAC#, INT0-3#, READY#              t11 + 7 ns
t16      reset setup time                       t16 + 6





* symbol refers to 80960MC specification 



TABLE 3. ICE-960MC Emulator DC Specifications 



Symbol*    Parameter                             Maximum

PM-Icc     Supply current with 80960KB-20        1400 mA
OIB-Icc    Supply current                        PM-Icc + 1100 mA
REM-Icc    Supply current                        PM-Icc + 1300 mA (1700 Total Typical)
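
Because the OIB and REM figures in Table 3 are given relative to PM-Icc, sizing a supply takes a small addition. The sketch below reads "PM-Icc + 1100 mA" as the processor module current plus an 1100 mA adder; that reading, and the function name, are assumptions. Whether the OIB and REM can be installed together is not stated here, so each option is shown separately.

```python
PM_ICC_MA = 1400                          # processor module maximum, from Table 3
ADDERS_MA = {"OIB": 1100, "REM": 1300}    # maximum adders, from Table 3


def max_supply_ma(options=()) -> int:
    """Worst-case supply current in mA for the probe plus the listed options."""
    return PM_ICC_MA + sum(ADDERS_MA[name] for name in options)


print(max_supply_ma())            # probe alone
print(max_supply_ma(("OIB",)))    # probe with Optional Isolation Board
print(max_supply_ma(("REM",)))    # probe with relocatable expansion memory
```

These are maximum figures; the table quotes a lower typical total for the REM configuration.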
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TABLE 4. Additional DC Loading 




              (without OIB installed)   (with OIB installed)    (with REM installed)
              Iih        Iil            Iih        Iil          Iih        Iil
Signal        Maximum    Maximum        Maximum    Maximum      Maximum    Maximum
AD (0:31)     100 uA     0.6 mA         20 uA      -1 mA        120 uA     0.7 mA
ADS#          140 uA     1.6 mA         20 uA      -1 mA        Driven by 74AS760
DEN#          40 uA      1.0 mA         20 uA      -1 mA        w/ 4.7k pull-up
W/R#          140 uA     1.6 mA         20 uA      -1 mA        150 uA     1.7 mA
CLK2          80 uA      2.2 mA         50 uA      -2 mA        130 uA     2.9 mA
RESET                                   50 uA      -2 mA        250 uA     0.3 mA
BE (0:3)#                               20 uA      -1 mA        10 uA      0.1 mA
READY#                                  20 uA      -1 mA        750 uA     0.8 mA
ALE#                                    20 uA      -1 mA        20 uA      0.5 mA
DT/R#                                   20 uA      -1 mA
INT0#, INT3#                            20 uA      -1 mA
INT1, INT2                              20 uA      -1 mA
BADAC#                                  20 uA      -1 mA
LOCK#                                   20 uA      -1 mA
HOLD                                    20 uA      -1 mA
FAILURE#                                20 uA      -1 mA
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POWER SUPPLY 

100-120V or 220-240V (Selectable) 

50-60 Hz 

2 amps (AC Max) @ 120V 

1 amp (AC Max) @ 240V 

ENVIRONMENTAL 
CHARACTERISTICS 

Operating Temperature    10°C to 40°C (50°F to 104°F) 

Operating Humidity       Maximum 85% Relative Humidity, 
                         non-condensing 



ORDERING INFORMATION 



Order Code     Description 

ICE960MC       The complete 20 MHz ICE- 
               960MC emulator system 
               including control unit, 
               processor module, power 
               supply, SAST, OIB, SAB, 
               serial communications cable 
               (SCOM4), IEDIT, V1.0 
               software. (Requires software 
               license, Class I) 

ICE960MC25P    25 MHz ICE960MC as 
               described above 

I960MCUPG      Conversion kit to convert 
               ICE-960KB to ICE-960MC. 
               Consists of new host and 
               probe software, probe 
               firmware, and manual. 
               Requires ICE-960KB V2.0 
               hardware. 

ICE960KBREM    Optional 2 Mbyte Relocatable 
               Expansion Memory Board. 
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LOW COST EVALUATION TOOL 

The QT960 products give you a 32-bit starter kit to begin software evaluation and 
hardware design at a low cost. The boards feature the 20 MHz 80960KB 32-bit embedded 
processor. The 80960KB has integrated floating point, instruction and register caches, 
and an on-chip interrupt controller. The 80960K-series are the first in a new 
architectural family of embedded processors from Intel built using Intel's CHMOS IV† 
process. These boards provide you with full access to the features of the 80960KB 
processor. A wire wrap prototyping area offers you easy access to board features to test 
your designs. Interleaved EPROM means fast execution of your code taking advantage of 
the 80960KB's burst bus. A programmable wait state generator simulates different 
memory environments useful in evaluating the performance of your code. These features 
make the QT960 boards useful low cost tools for the 32-bit embedded designer. 

Once your program is written, you can debug it with NINDY, an EPROM-resident debug 
monitor. NINDY enables you to download code, set seven different trace modes, display 
and modify memory or registers, and disassemble problem code sequences. 

Available separately from Intel are the ASM-960 (assembly language) and iC-960 (high- 
level language) products which provide you with the code development environment for 
the QT960 boards. 

The starter kit comes in two versions: the QT960F version has fast SRAM, high speed 
EPROM and Flash memory; the QT960E version has lower cost SRAM, Flash memory 
and no high speed EPROM. Each version has NINDY in either EPROM (QT960F) or 
Flash memory (QT960E), power supply cable, and the QT960 User Manual. Both versions 
also include the parts list, source code of the debug monitor, and the board data base 
(schematics) all on diskette. Armed with this starter kit you now have a system to 
evaluate and prototype your product ideas quickly and at low cost. 
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FEATURES 



QT960 FEATURES 

• 20 MHz Execution Speed 

• 128K Bytes of Zero Wait State EPROM‡ 

• 128K Bytes of Flash Memory 

• 128K Bytes of Zero Wait State SRAM‡ 

• Programmable Wait State Generator 

• Prototyping Wire Wrap Area 

• Five Instruction Traces 

• Two Hardware Breakpoints 

• Display/Modify Memory and Registers 

• Code Disassembly 

• High Level Language Support 

• RS-232 Communications Link 

• The QT960E Version has 128K Bytes of Two 
Wait State SRAM and 128K Bytes of Four Wait 
State Flash Memory 

Product Order Codes: EVQT960F20 and EVQT960E20 

†CHMOS IV is a patented Intel process. 
‡QT960F Version only. 

FAST AND EASY CODE UPDATES 

128K Bytes of Intel's 28F256 Flash memory provides an easy and quick method of changing your 
code in nonvolatile memory. Flash memory may be conveniently reprogrammed without 
removing it from the board while software is under development. 

FAST EPROM 

Interleaved fast EPROM (Intel's 27C202) on the QT960F version yields one-zero-zero-zero wait 
state code access. It efficiently utilizes the four word burst capabilities of the 80960KB bus 
maximizing program performance. 

PROTOTYPING SUPPORT 

A prototyping wire wrap area is provided on board with access to the system's signals and buses. 
This area gives you access to the board's features and allows you to easily test design ideas. A 
system bus connector is also provided for off board prototyping. 

PROGRAMMABLE WAIT STATE GENERATOR 

A software programmable wait state generator enables you to quickly model various memory 
speeds. Under software control you can set over 16 different wait state combinations and evaluate 
the performance of your target system. 
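
A rough feel for what a wait-state setting costs can be had from a simple cycle count. The model below is a deliberate simplification and not taken from the 80960KB data sheet: it assumes a burst costs one address cycle plus, for each word, one data cycle and that word's wait states, and it ignores any recovery cycles. The function names are illustrative.

```python
def burst_clocks(wait_states) -> int:
    """Clocks for one burst: an address cycle, then data plus waits per word."""
    return 1 + sum(1 + w for w in wait_states)


def burst_ns(wait_states, mhz: float = 20.0) -> float:
    """Duration of the burst in nanoseconds at the given processor clock."""
    return burst_clocks(wait_states) * 1000.0 / mhz


for pattern in [(1, 0, 0, 0), (2, 2, 2, 2), (4, 4, 4, 4)]:
    print(pattern, burst_clocks(pattern), "clocks,", burst_ns(pattern), "ns")
```

Stepping the generator through a few patterns and comparing counts like these against measured run times is one way to estimate how sensitive your code is to memory speed.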

DMA 

The board offers you eight DMA channels accessed through a NINDY library function using 
Intel's 82380. In addition, off board connectors provide DMA I/O capabilities. 

FIVE INSTRUCTION TRACES AND TWO HARDWARE 
BREAKPOINTS 

NINDY utilizes the built-in trace capabilities of the 80960KB to provide you with single step, 
supervisor, call, return, and branch instruction tracing offering you extensive debug capabilities 
for software examination and modification. Two hardware breakpoints enable you to break on 
and examine EPROM resident code. 
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FEATURES 



HIGH LEVEL LANGUAGE SUPPORT 

NINDY is capable of downloading absolute object code generated by ASM-960 or iC-960. ASM-960 
and iC-960 may be purchased separately from Intel. 

COMMUNICATION AND SOFTWARE REQUIREMENTS 

The QT960 boards communicate with the host through the RS-232 link using an Intel 82510 
UART provided on board. The boards support five baud rates: 1200, 2400, 9600, 19200, and 38400. 
The default is 9600 baud. To communicate with the QT960 boards you must meet the following 
minimum software requirements: 



• Terminal Emulator 

• XMODEM Download Capabilities 
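
For readers unfamiliar with XMODEM, the sketch below builds one packet in the classic checksum variant of the protocol: a start-of-header byte, the block number and its one's complement, 128 data bytes, and an 8-bit arithmetic checksum. Whether NINDY expects this variant or the CRC variant is not stated here, so treat the choice as an assumption; in practice the terminal emulator's XMODEM option handles the framing for you.

```python
SOH = 0x01          # start-of-header byte for a 128-byte block
BLOCK_SIZE = 128
PAD = 0x1A          # conventional padding byte for a short final block


def xmodem_packet(block_number: int, data: bytes) -> bytes:
    """Frame up to 128 bytes as one checksum-mode XMODEM packet."""
    if len(data) > BLOCK_SIZE:
        raise ValueError("data must fit in one 128-byte block")
    payload = data.ljust(BLOCK_SIZE, bytes([PAD]))
    seq = block_number & 0xFF              # block numbers wrap at 256
    checksum = sum(payload) & 0xFF         # low 8 bits of the byte sum
    return bytes([SOH, seq, 0xFF - seq]) + payload + bytes([checksum])


packet = xmodem_packet(1, b"\x01\x02\x03")
print(len(packet), packet[:3].hex(), hex(packet[-1]))
```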
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Block Diagram of the QT960 Board 

For information or the number of your nearest sales office call 800-548-4752 (U.S. and Canada). 

Intel Corporation, Literature Department, 3065 Bowers Avenue, Santa Clara CA 95051, United States. Tel: 408-987-8080. 
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DB960CADIC IN-CIRCUIT DEBUG MONITOR 




DB960CADIC 



Intel's DB960CADIC, the in-circuit debug monitor for the 33 MHz i960CA embedded 
microprocessor, represents a new generation of development tool technology. 

DB960CADIC allows users to debug high-speed, cached applications at the full speed of 
the i960CA target processor. Controlled by Intel's DB interface, DB960CADIC offers the 
user a tool with a powerful feature set at a fraction of the cost of traditional development 
tools. DB960CADIC is designed to improve productivity by allowing the user to debug 
software before and after the target system arrives, with minimal hardware intrusion. 

Features 



• Real-time emulation of the i960CA 
embedded microprocessor at speeds up to 
33 MHz 

• Full development and debug support for 
i960CA on-chip cache and RAM 

• Minimal intrusive operation, allowing 
the user to debug the target system with 
minimal modification subject to initial 
design constraints 

• Breakpoint capabilities include ten 
software breakpoints, two hardware 
execution address breakpoints, and two 
hardware data address breakpoints. The 
human interface supplements these 
breakpoints with the ability to break on 
data values, conditions, and a four-state 
state machine in non-real time. 



• Low-Cost 

• Source-Level, Symbolic Debugging in a 
Windowed Human Interface with pull- 
down Menus (DB). This interface is 
consistent across i960CA tools. 

• 128K Bytes User Memory 

• Virtual I/O, the ability to perform I/O 
between the DB960CADIC unit and the 
host 

• In-Circuit operation facilitates easy 
transition between target systems 

• Optional Stand-Alone Self-Test 
(DB960CASAST) Module 

• Optional Logic Analyzer Interface Board 
(LAI960CA) 
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Full-Speed Debug and 
Development 

The DB960CADIC In-Circuit Debug Monitor 
provides sophisticated real-time hardware and 
software debug capabilities for i960CA 
embedded microprocessor-based designs. The 
user can run at the full speed of the target 
processor, ensuring that elusive timing bugs 
will be found. The DB960CADIC is jumpered to 
receive a clock pulse from either the user's 
target system, or from an internal 25 MHz 
clock. 

Ideal for All Stages of 
Development 

DB960CADIC can be used by both hardware 
and software developers, at any stage of design. 
Early in the development process, 
DB960CADIC allows software debugging when 
inserted into an existing i960CA board such as 
the DB960CASAST module or the EV80960CA 
board. Later in the design cycle, DB960CADIC 
can be inserted into the user's target system, 
thus facilitating debug of hardware/software 
integration. 

Speed Development with Source 
Code, Symbolic Debugging 

Using source code oriented debugging in a 
windowed, symbolic interface, software 
engineers can increase productivity by 
debugging in the medium they are familiar 
with, software source. 

Commands can be entered via either function 
keys, pull-down menus which group logically 
related commands, or a supplementary 
command line which allows entry of complex 
conditions. In addition, source code symbolics 
can be used to examine and modify memory 
and registers. Optimal symbolic debugging can 
be achieved when using DB960CADIC with 
genuine Intel languages. 

Powerful Break Capabilities 

DB960CADIC provides complex emulation 
control by utilizing the on-chip debug registers 
within the i960CA. Real-time break 
capabilities include the ability to break on any 
two execution addresses or data access 



addresses in hardware. Software breakpoints 
are also used to supplement the hardware 
breakpoints for RAM-based memory 
subsystems. DB960CADIC extends these 
capabilities by providing the ability to break 
on data values, NOT data values, or 
combinations of the above in a four-state state 
machine. More complex conditions such as 
breaking when a variable is less than a certain 
value can be entered via a very flexible feature 
called conditional breakpoints. 

128K Bytes User Memory 

DB960CADIC provides the user with 128K 
bytes of memory in Region F of the i960CA 
target space. Since the debug monitor is also 
placed in Region F, the on-chip bus interface 
unit of the i960CA is configured to address 
region F as byte-wide memory with 5 
waitstates and no burst accesses allowed. 
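
As an illustration of the Region F placement described above, the following C fragment is a minimal sketch of how target software might test whether an address falls in Region F. It assumes the sixteen i960CA memory regions are selected by address bits 31:28; the helper names region_of and in_region_f are hypothetical and are not part of any Intel library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: address bits 31:28 select one of sixteen 256-Mbyte regions,
   so Region F covers 0xF0000000 through 0xFFFFFFFF. */
#define REGION_SHIFT 28u
#define REGION_F     0xFu

/* Hypothetical helper: region number (0x0..0xF) of a 32-bit address. */
static unsigned region_of(uint32_t addr)
{
    return (unsigned)(addr >> REGION_SHIFT);
}

/* Hypothetical helper: true when the address lies in Region F, which the
   monitor and its 128K bytes of user memory occupy. */
static bool in_region_f(uint32_t addr)
{
    return region_of(addr) == REGION_F;
}
```

A target design could use a check of this kind in test code to confirm that none of its own memory or I/O has been mapped into the region reserved for the monitor.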

Virtual Input/Output 

DB960CADIC is shipped with documented 
library calls which provide users with a built- 
in mechanism of performing target I/O using 
the host system. These libraries provide the 
ability to simulate I/O operations in the target 
system before target hardware is available. 

High Speed Serial Link 

Communication between a host and the 
DB960CADIC module is supported via RS232 
and RS422 communication links. RS232 allows 
access to industry standard serial protocols 
while the RS422 link provides a higher speed 
communication mechanism currently 
emerging in the development market. PC/AT 
Compatible RS422 communication boards are 
available from various third party vendors. 

Optional Stand-Alone Self Test 
Chassis 

An optional stand-alone self test chassis 
complements DB960CADIC by allowing the 
user to debug and test code before prototype 
hardware is available. The DB960CASAST 
includes self-test circuitry to ensure that the 
DB960CADIC unit is working correctly. It also 
provides 4 Mbytes of DRAM to be used for
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developing applications. This memory has a 
(3,1,1,1) waitstate pattern at 25 MHz. This 
waitstate pattern is programmable using the 
bus controller unit in the i960CA. It also 
includes an 8254 programmable timer which 
can optionally interrupt the i960CA processor 
and provide the ability to time code sequences. 

Optional Logic Analyzer Interface 
Board 

The LAI960CA board provides access to 
i960CA pins by routing the signals to easily 
accessible stake pins while passing them 
through to the target system. 

Software Completes the System 

Intel provides a comprehensive software 
development environment to complement 
DB960CADIC. This environment includes C 
and ASM source languages, a retargetable 
debug monitor, and DB960CADIC. The 
languages support the entire range of 80960 
embedded processors. 



Worldwide Service, Support, and
Training

To augment its development tools, Intel offers 
a full array of seminars, classes, workshops, 
field application engineering expertise, hotline 
technical support, and on-site service. 

Intel also offers a Software Support contract 
which includes technical software information, 
telephone support, automatic distributions of 
software and documentation updates, 
iCOMMENTS publication, remote diagnostic
software, and a development tools
troubleshooting guide.

Intel's 90-day Hardware Support package 
includes technical hardware information, 
telephone support, warranty on parts, labor, 
material, and on-site hardware support. 

Intel Development Tools also offers a 30-day, 
money-back guarantee to customers who are 
not satisfied after purchasing any Intel 
development tool. 



DB960CADIC SPECIFICATIONS AND REQUIREMENTS 



Host System Requirements 

Host system requirements to run the in-circuit 
debugger include the following: 

—DOS version 3.2 or later excluding DOS 4.0 

—640 Kbytes of RAM in conventional memory

—A 20 MB hard disk 

—An RS232 or RS422 Serial Port 

Evaluated Systems include: 

IBM PC-AT* with DOS 3.3 

COMPAQ 386* with DOS 3.3 



Intel 301/302* with DOS 3.3 

IBM Personal System/2* Model 70/80 with 
DOS 4.01 

Environment Characteristics 

Operating Temperature: +10°C to +40°C
(50°F to 104°F)

Operating Humidity: Maximum of 90% 
relative humidity, 
non-condensing. 
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DB960CADIC Interface 
Considerations 

Target systems intended to receive 
DB960CADIC must meet the following 
requirements: 

• The target system must not respond to
memory accesses in Region F (0F0000000-
0FFFFFFFF) with DB960CADIC installed.
DB960CADIC provides an ACTIVE out 
signal which can be used to qualify bus logic 
to prevent this occurrence when 
DB960CADIC is installed. 

• The target system must provide 1.3 Amps
(worst case), 0.9 Amps average, to power
the DB960CADIC unit.

• Use of one of the nine directly accessible 
i960CA interrupts. 

• Use of interrupt table entry 242 or 248. 

• Additional Signal Loading as follows: 

The DB960CADIC makes use of the PCLK
outputs, D0 through D7, and some of the
address and control signals of the processor. 
The following table lists the worst case 
loadings added by the presence of the 
DB960CADIC circuitry. 



Ordering Information 

DB960CADIC     In-circuit debug monitor for
               the i960CA embedded
               microprocessor. Operates at
               speeds up to 33 MHz. Includes
               hardware debug module,
               RS232/RS422 serial cables,
               DOS host software, and
               documentation.

DB960CASAST    Stand-Alone Self Test Unit for
               DB960CADIC. Includes built-
               in power supply, self-test
               board, 4 Mbytes of usable
               DRAM for code development,
               and enclosure.

DB960CAST      DB960CADIC and
               DB960CASAST as described
               above.

LAI960CA       Optional Logic Analyzer
               Interface Board for the
               i960CA system. Does not
               require DB960CADIC.



Signal        DC Load       Capacitive
Name          (µA)          Load (pF)

PCLK1         +25/-250       8
PCLK2         +30/-255      17
CLKIN         +12/-12       13
D0:D7         +20/-600      10
A31:26        +25/-250      11
A2:A17        +20/-100      10
BE0*,BE1*     +20/-100      10
ADS*          +50/-500      13
W/R*          +50/-500      13
WAIT*         +25/-250       8
BLAST*        +25/-250       8
FAIL*         +20/-20        8
RESET*        +15/-15       25
INT0:7*       +20/-500      15
NMI*          +20/-500      15

Additional Loading Imposed on the Target by 
the DB960CADIC 
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INTEL DEVELOPMENT TOOLS SOFTWARE 

SERVICES 





Intel is committed to providing high quality products and customer support. Our 
commitment to quality is demonstrated by a 30 day, money-back, unconditional refund to 
customers not satisfied with their purchase of an Intel Development Tools product. 

Intel supports its customers by offering a 90-day software warranty and standard 
software support including free technical support over the phone. 

Intel software is continuously undergoing improvement. For customers who desire the 
security of having the most current software and the convenience of having updates sent 
automatically, Intel offers inexpensive Software Support Contracts. 



SOFTWARE WARRANTY 

The standard software warranty is 90 days 
and entitles the customer to the following 
(provided the customer has registered their 
software by returning a completed 
Warranty Registration Card): 

• Replacement of defective media 

• Software product updates occurring 
within the 90 day warranty period. 



STANDARD SOFTWARE 
SUPPORT 

Standard Software Support, provided at no 
additional cost, offers the following 
additional benefits: 

• Free Technical Information Phone 
Service ("TIPS")

• Timely response to Software Problem 
Reports 
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Software Support Contracts 

Software Support Contracts cover products for 
one year from the date of purchase and are 
renewable annually. The following benefits are 
provided: 

• Automatic Software Updates 

• Standard Software Support 



• Remote Diagnostic Software for DOS-based 
products. 

• Monthly issues of iCOMMENTS, a technical 
support publication 

• Quarterly issues of Troubleshooting Guides 
(host-specific) 

• Quantity discounts 



ORDERING INFORMATION 



Ordering Procedures 

For more information, call 1-800-468-3548 or 
your local Intel sales office. Similar support 
offerings are available outside of North 
America. Software Support Contracts are 
available for North American customers only. 

All orders for contracts, including renewals, 
can be submitted through the local Intel sales 
office or directly to the Development Tools 
Operation by calling 1-800-874-6835. 

To order a Software Support Contract, a 
customer must have registered their product 
or provide proof of ownership. Customers must 
also have the most current version of the 
software, otherwise, they must order a product 
upgrade before a support contract may be 
purchased. 

Pricing is a percentage of the List Price, based 
on the number of copies covered by the 
Software Support Contract. For emulators, the 
percentage will be applied to the identified list 
price of the software portion only, not the full 
list price of the emulator. 



Pricing Information 

Quantity discounts are: 

Product Quantity     Pricing per Copy

1-10 copies          20% of List Price
11-25 copies         15% of List Price
26+ copies           10% of List Price

VAX and Micro VAX software not included. 
Please call 1-800-874-6835 for price quote. 
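
The quantity pricing above can be worked through with a short C sketch. It assumes that the percentage for the total number of copies applies to every copy covered by the contract, which the table implies but does not state outright; the function names are hypothetical and prices are held in cents to avoid floating-point rounding.

```c
#include <assert.h>

/* Percentage of list price charged per copy, taken from the quantity table. */
static int contract_percent(int copies)
{
    assert(copies >= 1);
    if (copies <= 10)
        return 20;
    if (copies <= 25)
        return 15;
    return 10;
}

/* Total contract price in cents. Assumption: the percentage selected by the
   total quantity applies to each copy. For emulators, list_price_cents
   should be the list price of the software portion only. */
static long contract_total_cents(long list_price_cents, int copies)
{
    return list_price_cents * copies * contract_percent(copies) / 100;
}
```

For example, twelve copies of a product with a $1,000.00 software list price would be covered at 15% per copy, or $1,800.00 per year under this reading.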

Ordering Information 

Order Code       Description

SWSUPPORT51      Software Support Contract for 51 family
SWSUPPORT86      Software Support Contract for 86 family
SWSUPPORT96      Software Support Contract for 96 family
SWSUPPORT286     Software Support Contract for 286 family
SWSUPPORT386     Software Support Contract for 386 family
SWSUPPORT486     Software Support Contract for 486 family
SWSUPPORT960     Software Support Contract for 960 family







iRMK™ 960
REAL-TIME KERNEL

• 32-Bit Real-Time Multi-Tasking Kernel
for the i960™ Microprocessor Family

• Flexible, Modular Design to Ease
System Integration

• Fast Execution with Predictable
Response Time for Time-Critical
Applications

• Compact Code Size (14 Kbytes,
Including All Optional Modules)

• Requires Only an i960 KA, KB or MC
Embedded Processor

• Bus Independent

• Easy Customization and Add-On
Enhancements

• Easily EPROMmable

• Comprehensive Development Tool
Support

The iRMK 960 Real-Time Kernel is the 32-bit real-time executive developed and supported by Intel, the i960 
architecture experts. The kernel is a small, fast and highly modular package of system control software. It 
contains the basic software building blocks that act as the foundation in using the key features of the i960 
microprocessor. The iRMK 960 software is fully supported by an array of tools that work in the most popular 
development environments (i.e., DOS*, VAX/VMS*, SUN*). 

The iRMK 960 Real-Time Kernel is available off-the-shelf. The kernel reduces the cost and risk of designing
and maintaining software for numerous real-time applications, such as embedded control systems and
dedicated real-time subsystems in multiprocessor environments. Use of the kernel can save man-years that might
otherwise be spent developing or porting another real-time kernel. This means reduced time to market for the
user.




*DOS® is a registered trademark of Microsoft Corporation.
VAX/VMS™ is a trademark of Digital Equipment Corporation.
SUN™ is a trademark of Sun Microsystems.
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ARCHITECTURAL OVERVIEW 

At the heart of the architecture are the kernel core 
modules consisting of a scheduler, task manager, 
interrupt manager and time manager (See Figure 1). 
As additional building blocks, the kernel provides op- 
tional modules consisting of a mailbox manager, 
semaphore manager, memory manager, on-proces- 
sor interrupt controller manager and fault handler 
manager. The optional device manager for the 
82380 Integrated System Peripheral (ISP) and 8254 
Programmable Interval Timer (PIT) complete the ar- 
chitecture. 



FUNCTIONAL FEATURES 



A Full Set of Real-Time Building Blocks 

The kernel provides a full set of services for real- 
time applications including task management, time 
management, synchronization of and communica- 
tions between tasks, and memory pool manage- 
ment. 



TASK MANAGEMENT 

The iRMK 960 kernel uses system calls to create, 
manage and schedule tasks in a multi-tasking envi- 
ronment. It provides pre-emptive priority scheduling 
combined with optional time-slice (round robin) 
scheduling. 

The scheduling algorithm used by the kernel en- 
ables tasks to be rescheduled in a fixed amount of 
time regardless of the number of tasks. Applications 
may contain any number of tasks. 
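
One common way to obtain rescheduling time that does not depend on the number of tasks is to track ready priorities in a bitmap. The C fragment below is a sketch of that general technique only; it is not the kernel's actual algorithm, and the number of priority levels, the convention that a larger number is more urgent, and all names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_PRIORITIES 32   /* hypothetical; not the kernel's documented range */

/* Bit p is set when at least one task at priority p is ready to run. */
static uint32_t ready_mask;

static void mark_ready(unsigned prio)
{
    assert(prio < NUM_PRIORITIES);
    ready_mask |= (UINT32_C(1) << prio);
}

static void mark_blocked(unsigned prio)
{
    assert(prio < NUM_PRIORITIES);
    ready_mask &= ~(UINT32_C(1) << prio);
}

/* Returns the highest ready priority (assumed: larger number = more urgent),
   or -1 when nothing is ready. The loop is bounded by NUM_PRIORITIES, so the
   cost does not grow with the number of tasks in the application. */
static int highest_ready(void)
{
    for (int p = NUM_PRIORITIES - 1; p >= 0; --p) {
        if (ready_mask & (UINT32_C(1) << p))
            return p;
    }
    return -1;
}
```

Time-slice (round robin) scheduling would then rotate among the tasks that share the selected priority level.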

An application can integrate optional task handlers 
to customize task management. These handlers can 
execute on task creation, task switch, task deletion 
and task priority change. Task handlers can be used 
for a wide range of functions, including saving and 
restoring the state of coprocessor registers on task 
switch, masking interrupts based on task priority or 
implementing statistical and diagnostic monitors. 



INTERRUPT MANAGEMENT 

iRMK 960 interrupts are managed by immediately 
switching control to user-written interrupt handlers 
when an interrupt occurs. 
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Figure 1. iRMKTM 960 Real-Time Architecture 
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Response to interrupts is both fast and predictable. 
Most of the kernel's system calls can be executed 
directly from interrupt handlers. 

TIME MANAGEMENT 

The time management features included in the ker- 
nel provide single-shot alarms, repetitive alarms and 
a real-time clock. In addition, alarms can be reset. 

These time management facilities can solve a wide 
range of real-time programming problems. Single- 
shot alarms, for example, can be used to handle 
timeouts. If the timeout occurs, the alarm invokes a 
user-written handler; if the event occurs before the 
timeout, the application simply deletes the alarm. 
Other uses for the kernel's time management
facilities include polling devices with repetitive alarms,
putting tasks to sleep for specified periods of time,
or implementing a time-of-day clock.
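
The timeout pattern described above can be sketched in plain C as a small simulation. The kernel's own alarm calls appear in the system calls listing (KN_create_alarm, KN_delete_alarm, KN_tick), but their parameters are not given in this document, so the sim_alarm type and functions below are hypothetical stand-ins rather than the kernel interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*alarm_handler)(void *arg);

/* Hypothetical single-shot alarm used only for this simulation. */
typedef struct {
    unsigned      ticks_left;
    alarm_handler handler;
    void         *arg;
    bool          active;
} sim_alarm;

static void sim_alarm_start(sim_alarm *a, unsigned ticks,
                            alarm_handler h, void *arg)
{
    assert(a != NULL && h != NULL && ticks > 0);
    a->ticks_left = ticks;
    a->handler = h;
    a->arg = arg;
    a->active = true;
}

/* The awaited event arrived in time: cancel the pending timeout. */
static void sim_alarm_delete(sim_alarm *a)
{
    a->active = false;
}

/* Called once per clock tick; fires the handler when the count expires. */
static void sim_alarm_tick(sim_alarm *a)
{
    if (a->active && --a->ticks_left == 0) {
        a->active = false;          /* single-shot: fires only once */
        a->handler(a->arg);
    }
}

/* Example user-written handler: records that the timeout occurred. */
static void on_timeout(void *arg)
{
    *(bool *)arg = true;
}
```

If the event arrives before the tick count expires, the application calls sim_alarm_delete and the handler never runs; otherwise the handler runs exactly once.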

INTERTASK SYNCHRONIZATION AND 
COMMUNICATION 

Semaphores, regions and mailboxes are the key 
mechanisms the kernel uses for synchronizing tasks 
and communicating between tasks. 

Semaphores are objects used for intertask signaling 
and synchronization. Tasks exchange abstract 
"units" with semaphores as a means of becoming 
synchronized. A task requests a unit from a sema- 
phore to gain access to a resource. If the resource is 
available, the semaphore will have a unit to give to 
the task, enabling the task to proceed. A task sends 
a unit to a semaphore to indicate that it has released 
a previously obtained resource. 
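
The exchange of units can be pictured with a minimal C model. This is a sketch of the counting idea only: it does not block or queue tasks as the kernel does, and the sim_semaphore type and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: a semaphore is a count of available units. */
typedef struct {
    unsigned units;
} sim_semaphore;

/* A task releasing a resource sends a unit back to the semaphore. */
static void sim_send_unit(sim_semaphore *s)
{
    assert(s != NULL);
    s->units++;
}

/* A task requesting a resource asks for a unit. Returns true when a unit
   was available; a real kernel would instead make the task wait. */
static bool sim_receive_unit(sim_semaphore *s)
{
    assert(s != NULL);
    if (s->units == 0)
        return false;
    s->units--;
    return true;
}
```

A semaphore created with one unit and never given more than one behaves as a binary semaphore.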

A special binary type of semaphore is called a Re- 
gion. Regions are used to ensure mutual exclusion, 
thus preventing deadlock when tasks contend for 
control of system resources. A task holding a re- 
gion's unit runs at the priority of the highest priority 
task waiting in queue for the region's unit. 

Mailboxes are queues that can hold any number of 
messages and are used to exchange data between 
tasks. Either data or pointers can be sent using mail- 
boxes. The kernel allows mailbox messages to be of 
any length. High priority messages can be placed 
(jammed) at the front of the message queue to en- 
sure that they are received and processed before 
other messages queued at the mailbox. 
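
The difference between a normal send and a jammed (priority) send can be sketched with a small ring buffer in C. The fixed capacity, the integer message type, and all names below are assumptions made to keep the example short; the kernel's mailboxes are described above as holding any number of messages of any length.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MBOX_CAPACITY 8     /* hypothetical limit for this sketch only */

typedef struct {
    int      slots[MBOX_CAPACITY];
    unsigned head;          /* index of the oldest message */
    unsigned count;         /* number of queued messages */
} sim_mailbox;

/* Normal send: append at the tail of the queue. */
static bool mbox_send(sim_mailbox *m, int msg)
{
    if (m->count == MBOX_CAPACITY)
        return false;
    m->slots[(m->head + m->count) % MBOX_CAPACITY] = msg;
    m->count++;
    return true;
}

/* Priority send: jam the message in front of everything already queued. */
static bool mbox_send_priority(sim_mailbox *m, int msg)
{
    if (m->count == MBOX_CAPACITY)
        return false;
    m->head = (m->head + MBOX_CAPACITY - 1) % MBOX_CAPACITY;
    m->slots[m->head] = msg;
    m->count++;
    return true;
}

/* Receive: remove the message at the head of the queue. */
static bool mbox_receive(sim_mailbox *m, int *out)
{
    assert(out != NULL);
    if (m->count == 0)
        return false;
    *out = m->slots[m->head];
    m->head = (m->head + 1) % MBOX_CAPACITY;
    m->count--;
    return true;
}
```

With messages 1 and 2 queued normally and 99 jammed afterwards, the receiver sees 99 first, then 1, then 2.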

To ensure that high priority tasks are not blocked by 
lower priority tasks, the kernel allows tasks to queue 
at semaphores and mailboxes in priority order. The 
kernel also supports first-in, first-out task queueing. 



MEMORY POOL MANAGEMENT 

The iRMK 960 kernel uses the concept of memory 
pools to efficiently divide and manage blocks of 
memory. The memory pool manager provides for 
both fixed and variable block allocation. 

Memory can be divided into any number of pools. 
Multiple memory pools might be created for different 
speed memories, or for allocating different size 
blocks. The times to allocate and de-allocate fixed- 
size areas from within a pool have a fixed upper 
bound. 
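
A free list is one well-known way to give fixed-size allocation a fixed upper bound on time. The C sketch below illustrates that technique under assumed block size and count; it is not the kernel's memory manager, and the fixed_pool type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK_SIZE  32      /* hypothetical block size in bytes */
#define BLOCK_COUNT 4       /* hypothetical number of blocks in the pool */

/* A free block stores the link to the next free block in its own space. */
typedef union block {
    union block  *next;
    unsigned char bytes[BLOCK_SIZE];
} block;

typedef struct {
    block  storage[BLOCK_COUNT];
    block *free_list;
} fixed_pool;

/* Thread every block onto the free list. */
static void pool_init(fixed_pool *p)
{
    p->free_list = NULL;
    for (int i = BLOCK_COUNT - 1; i >= 0; --i) {
        p->storage[i].next = p->free_list;
        p->free_list = &p->storage[i];
    }
}

/* Allocate: pop the first free block. Constant time; NULL when exhausted. */
static void *pool_alloc(fixed_pool *p)
{
    block *b = p->free_list;
    if (b != NULL)
        p->free_list = b->next;
    return b;
}

/* De-allocate: push the block back on the free list. Constant time. */
static void pool_free(fixed_pool *p, void *area)
{
    assert(area != NULL);
    block *b = area;
    b->next = p->free_list;
    p->free_list = b;
}
```

Because allocation and release each touch only the head of the list, their cost does not depend on how many blocks are in use. Variable-size allocation generally cannot offer the same guarantee.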

The kernel-supplied memory manager works with 
flat memory architecture. Users can also write their 
own memory manager to provide different memory 
management policies or support virtual memory. 



Hardware Requirements and Support 

The kernel requires only an i960 microprocessor and 
sufficient memory for itself and its application. The 
kernel's design, however, recognizes that many sys- 
tems use additional programmable peripheral devic- 
es and coprocessors. The kernel provides optional 
device managers for: 

• The 82380 Integrated System Peripheral (ISP)
chip

• The 8254 Programmable Interval Timer (PIT) chip

An application can supply managers for other devic- 
es and coprocessors in addition to or in replacement 
of the devices listed above. 

The openness of the iRMK 960 kernel is a major 
benefit to the OEM. The kernel is designed to be 
programmed into PROM or EPROM, making it easy 
to use in embedded designs. In addition, it can be 
used with any system bus, including those of MULTI- 
BUS I and MULTIBUS II bus architectures. 



A Modular Architecture for Easy 
Customization 

The kernel is designed for maximum flexibility. It can 
be customized for any application. Each major func- 
tion, mailboxes for example, is implemented as a 
separate module. The kernel's modules have not 
been linked together and are supplied individually. 
(See Table 1 for the list of kernel modules, and their 
approximate sizes.) 

The user links only the modules needed for the
application. Any module not used does not need to be
linked in, and does not increase the size of the
kernel in the application. The user can also replace
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any optional kernel module with one that imple- 
ments specific features required by the application. 
For example, the user might want to replace the ker- 
nel's memory manager with one that supports virtual 
memory. 



Table 1. iRMK™ 960 Kernel Modules
and Approximate Sizes

                                        Bytes

Core Modules

Task Manager                            2600
Interrupt Manager                        150
Time Manager                            3000
Scheduler                               1700
Initialization                            50

Optional Modules

Mailbox Manager                         1250
Semaphore Manager                       2900
Memory Manager                          1260
Fault Handler Manager                     50
Miscellaneous                            300

Optional Device Manager

82380 Integrated System Peripheral      4200
8254 Programmable Interval Timer        1200



Total size of the (entire) kernel (minus device man- 
agers) is about 13.5 Kbytes. 
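
Because only the modules that are linked contribute to the kernel's size, the footprint of a given configuration can be estimated by adding the Table 1 entries. The following C sketch does that arithmetic; the array and function names are hypothetical, and the byte counts are the approximate figures from Table 1.

```c
#include <assert.h>
#include <stddef.h>

/* Approximate sizes in bytes from Table 1, in table order. */
static const unsigned core_sizes[5] = {
    2600,   /* Task Manager */
    150,    /* Interrupt Manager */
    3000,   /* Time Manager */
    1700,   /* Scheduler */
    50      /* Initialization */
};

static const unsigned optional_sizes[5] = {
    1250,   /* Mailbox Manager */
    2900,   /* Semaphore Manager */
    1260,   /* Memory Manager */
    50,     /* Fault Handler Manager */
    300     /* Miscellaneous */
};

/* Sum the sizes of the modules chosen for linking. */
static unsigned total_bytes(const unsigned *sizes, size_t n)
{
    assert(sizes != NULL);
    unsigned total = 0;
    for (size_t i = 0; i < n; ++i)
        total += sizes[i];
    return total;
}
```

The core modules alone come to 7,500 bytes. Adding every optional module gives 13,260 bytes, in line with the approximate total quoted above for the kernel without device managers.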



These tools include:

Software:
ASM 960 assembler
iC 960 compiler

NOTE:
These tools are available for DOS,
VAX/VMS*, MicroVAX*, SUN* and
EVA960KB 4MB environment

Debuggers:
ICE™ 960        In-Circuit Emulator for the i960
                microprocessor
SDM™ 960        System Debug Monitor for the i960
                microprocessor

Evaluation Vehicles:
EVA960KB        AT Bus-Compatible Board
EVA960KB4MB     AT Bus-Compatible Board with
                4 Mbytes of Memory
QT960           Standalone Evaluation Vehicle



Intel Support, Consulting and Training 

With iRMK 960 kernel software, the developer has 
available the total Intel i960 architecture and real- 
time expertise of Intel's support engineers. Intel pro- 
vides telephone support, on or off-site consulting, 
troubleshooting guides and updates. The kernel in- 
cludes 90 days of Intel's Technical Information 
Phone Service (TIPS). Extended support and con- 
sulting are also available. 



Developing with the iRMK™ 960 
Real-Time Kernel 

Kernel applications can be written using any
language or compiler that produces code that executes
on the i960 microprocessor. This independence is
achieved by using an interface library. This library
works with the idiosyncrasies of a particular
language, for example, the ordering of parameters.
The interface library translates the calls provided by
the language into a standard format expected by the
kernel. Intel provides an interface library for our iC
960 compiler. The source code of this library is
included, so that the user can modify it to support
other compilers.

Because the kernel is supplied as unlinked object
modules, applications can be developed on any
system that hosts the development tools needed.



Comprehensive Development Tool 
Support 

Intel provides a complete line of 80960 development 
tools for writing and debugging iRMK 960 applica- 
tions. 



Contents of the iRMK™ 960 Kernel
Development Package

The iRMK 960 Kernel comes in a comprehensive 
package including: 

• Kernel object modules 

• Source for the kernel supplied 82380 Integrated 
System Peripheral and 8254 PIT device manag- 
ers 

• Source for the iC 960 interface library 

• Source for sample applications showing the fol- 
lowing: 

— Structure of kernel applications 

— Use of the kernel with an application written in iC
960 language

— Compile, bind and build sequences 

— Sample initialization code for the i960 microproc- 
essor 

— Applications written to execute in a flat memory 
space 

• User reference guide 

• 90 days of customer support 
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LICENSING 

iRMK 960 software requires prior execution of the 
standard Intel Software License Agreement (SLA). A 
single development copy requires a Class I license 
and allows iRMK 960 software to be loaded and run 
on one single-processor system. 



SPECIFICATIONS 



System Calls 

The following items are system calls arranged by 
type: 



iRMK™ 960 KERNEL SYSTEM CALLS LISTING

KERNEL INITIALIZATION

KN_initialize             Initialize kernel

OBJECT MANAGEMENT

KN_token_to_ptr           Returns a pointer to the
                          area holding object
KN_current_task           Returns a pointer for the
                          current task

TASK MANAGEMENT

KN_create_task            Creates a task
KN_delete_task            Deletes a task
KN_suspend_task           Suspends a task
KN_resume_task            Resumes a task
KN_set_priority           Change priority of a task
KN_get_priority           Return priority of a task

INTERRUPT MANAGEMENT

KN_set_interrupt          Specify interrupt handler
KN_stop_scheduling        Suspend task switching
KN_start_scheduling       Resume task switching

TIME MANAGEMENT

KN_sleep                  Put calling task to sleep
KN_create_alarm           Create and start virtual
                          alarm clock
KN_reset_alarm            Reset an existing alarm
KN_delete_alarm           Delete alarm
KN_get_time               Get time
KN_set_time               Set time
KN_tick                   Notify kernel that clock
                          tick has occurred

INTERTASK COMMUNICATION AND
SYNCHRONIZATION

KN_create_semaphore       Create a semaphore
KN_delete_semaphore       Delete a semaphore
KN_send_unit              Add a unit to a
                          semaphore
KN_receive_unit           Receive a unit from a
                          semaphore
KN_create_mailbox         Create a mailbox
KN_delete_mailbox         Delete a mailbox
KN_send_data              Send data to a mailbox
KN_send_priority_data     Place (jam) priority
                          message at head of
                          message queue
KN_receive_data           Request a message
                          from a mailbox

MEMORY MANAGEMENT

KN_create_pool            Create a memory pool
KN_delete_pool            Delete a memory pool
KN_create_area            Create a memory area
                          from a pool
KN_delete_area            Return a memory area to
                          a memory pool
KN_get_pool_attributes    Get a memory pool's
                          attributes

PROGRAMMABLE INTERRUPT
CONTROLLER MANAGEMENT

KN_initialize_PICs        Initialize PICs
KN_mask_slot              Mask out interrupts on a
                          specified slot
KN_unmask_slot            Unmask interrupts on a
                          specified slot
KN_send_EOI               Signal the PIC that the
                          interrupt on a specified
                          slot has been serviced
KN_new_masks              Change interrupt masks
KN_get_slot               Return the most
                          important active interrupt
                          slot
KN_get_interrupt          Get address of specified
                          interrupt handler
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PROGRAMMABLE INTERVAL CONTROLLER 


MANAGEMENT

KN_initialize_PIT             Initialize the PIT
KN_start_PIT                  Start PIT counting
KN_get_PIT_interval           Return PIT interval

PROCESSOR RECOGNIZED FAULT HANDLING

KN_get_fault_handler          Get address of fault
                              handler currently
                              associated with
                              specified fault type
KN_set_fault_handler          Establish address of
                              fault handler for the
                              specified fault type

PROCESSOR INTERRUPT
CONTROLLER SUPPORT

KN_get_processor_priority     Returns value of the
                              processor priority
KN_set_processor_priority     Change the value of the
                              processor priority

PERFORMANCE

The figures listed below were derived from a test using
an 80960KB running at 20 MHz. The EVA-960 has
what is known as 2-1-1-1 wait state memory; what
this means is that the first instruction of a four
instruction fetch takes two wait states, and each of the
three successive instructions takes one wait state.
The figures are the worst case values obtained from
several sets of test runs. The code was generated
using the iC 960 DOS hosted compiler, Version 1.1.

Action                                  Time (in µs)

Create Semaphore                          6
Delete Semaphore                         14
FIFO Semaphore Send Unit                  7
FIFO Semaphore Receive Unit               7
Region Semaphore Send Unit               18
Region Semaphore Receive Unit            14

Create Mailbox                           19
Delete Mailbox                           23
Send Data                                21
Receive Data                             21

Create Alarm                             29
Delete Alarm                             30

FIFO Semaphore Send/Receive
Unit with Task Switch                    75
Suspend Task with Task Switch            70
Basic Task Switch                        50

Create Task                              62
Suspend Task                             26
Resume Task                              50
Delete Task                              50
Get Priority                              5
Set Priority                             27

Set Interrupt                             3
Get Interrupt                             3

Create Pool                              18
Get Pool Attributes                      36
Delete Pool                               1
Create Area                              35
Delete Area                              32
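
The 2-1-1-1 wait-state notation used in the performance notes can be made concrete with a little arithmetic in C. The sketch below simply adds up the wait states of a burst; the clock estimate rests on the labeled assumption that each access costs one data clock plus its wait states, and it ignores address and recovery cycles. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Total wait states in a burst, e.g. {2, 1, 1, 1} for 2-1-1-1 memory. */
static unsigned total_wait_states(const unsigned *pattern, size_t n)
{
    assert(pattern != NULL);
    unsigned total = 0;
    for (size_t i = 0; i < n; ++i)
        total += pattern[i];
    return total;
}

/* Assumption: each access takes one data clock plus its wait states.
   Address and recovery cycles are deliberately left out of this estimate. */
static unsigned burst_data_clocks(const unsigned *pattern, size_t n)
{
    return (unsigned)n + total_wait_states(pattern, n);
}
```

Under these assumptions a four-access 2-1-1-1 burst carries five wait states and occupies nine data clocks, against four data clocks for zero wait state memory.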



MANUALS 

iRMK 960 User's Manual (Intel Order #463863- 
001). 



TRAINING INFORMATION 

Intel Customer Service Training:

"80960 KA/KB Embedded Processor Training
Course"

ORDERING INFORMATION 



Ordering Code     Product Description

RMK960            iRMK 960 Real-Time Kernel
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270870-1 

Low Cost Processor Evaluation Tool 

Intel's EV80960CA evaluation board provides a low-cost hardware environment for code 
execution and software debugging. The board features the 80960CA, the newest and 
highest performance member of Intel's family of 32-bit embedded microprocessors. The 
board allows a user's program to take full advantage of the power of the 80960CA and 
provides zero wait state execution of the user's code. 

Popular features such as single line assembler/ disassembler, single-step program 
execution and software breakpoints are standard on the EV80960CA's on-board monitor. 
Available separately, Intel offers a complete code development environment using the 
assembler (ASM-960) as well as high-level languages, such as Intel's iC-960 C compiler, to 
accelerate development schedules. 

The EV80960CA evaluation board package features the 80960CA System Debug Monitor 
(SDM) in EPROM, a SDM host software floppy disk, a power supply cable, a 9-pin PC/AT 
serial connector for terminal and the EV80960CA User's Manual. The EV80960CA 
User's Manual includes schematics of the board, a part list and programmable logic 
(PLD) equations. The board is hosted on an IBM or BIOS-compatible PC/AT. 



*The SRAM memory system provides zero wait state read (0-0-0-0-0) and one wait state write (1-1-1-1-0) performance.
**The DRAM memory system provides 2-1-1-1-1 reads and writes.
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EV80960CA Features 

• 25 MHz Execution Speed 

• 32 Kbytes of EPROM for 80960CA SDM 
Target Operating Firmware 

• 64 Kbytes of Zero Wait State Pipelined 
SRAM* 

• 1 Mbyte of Static-Column Mode DRAM** 
expandable to 4 Mbytes 

• Concurrent Interrogation of Memory and 
Registers 

• Software Breakpoints 

• Code Disassembly 

• High-Level Language Support 

• Two RS-232s for Host and User 
Communication 

• Two iSBX I/O Connectors 

• An Expansion Bus to Accommodate 
Eurocard Form-Factor Prototyping Boards 

Fast Pipelined SRAM Memory 
System 

The pipelined-read memory system of the
EV80960CA provides true zero wait state read
and one wait state write performance. The memory
design utilizes the internal wait state
generator of the 80960CA.



Fast Static-Column Mode DRAM 

The memory design of the EV80960CA uses 
the 80960CA burst mode bus and static-column 
DRAM mode. The DRAM control PLDs are 
functionally isolated into interconnected state 
machines. The PLDs can be changed to allow 
alternative DRAM memory implementations 
with different DRAM access modes (static- 
column mode, nibble mode or fast-page mode). 

Concurrent Interrogation of 
Memory and Registers 

The 80960CA System Debug Monitor (SDM) for 
the EV80960CA allows the user to read and 
modify internal registers and external memory 
while the user's program is running on the 
board. 

iSBX I/O Connectors and
Expansion Interface

The EV80960CA evaluation board has two 
connectors to support both 8- and 16-bit 
standard iSBX Expansion Modules. The board 
also provides an expansion bus to 
accommodate Eurocard form-factor 
prototyping boards. 
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Communication Link 

The EV80960CA board communicates with the 
host through the RS-232 link using an Intel 
82510 UART provided on board. The board 
supports seven baud rates: 300, 1200, 2400, 
4800, 9600, 19200 and 38400. 

Power Requirements 

The EV80960CA Evaluation Board requires 5V 
at 2000 mA and ± 12V at 25 mA. 



Host System Requirements 

The EV80960CA Evaluation Board is hosted on 
an IBM PC/AT or compatibles; a 386-based PC 
is recommended. The host system must meet 
the following minimum requirements: 

• 512 Kbytes of Memory 

• One 1.2 Mbyte Floppy Disk Drive 

• PC-DOS 3.2 or Later 

• A Serial Port (COM1 or COM2)
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i960TM SA/SB EVALUATION BOARD 

The EV80960SX board is a general purpose evaluation tool for the i960™ SA/SB
embedded processors. This evaluation board provides a high-performance DRAM 
subsystem, an interleaved EPROM subsystem, and a robust set of peripheral devices for 
benchmarking and debugging application code written for the i960 SA/SB embedded 
processors. 

The EV80960SX is a great starter kit for your 32-bit application. Together, the EV80960SX
and the NINDY debug environment, along with an assembler and C compiler (not provided),
form a seamless environment for developing code and evaluating the i960 SA/SB processors.
The NINDY monitor can download code from a number of popular development systems,
including DOS-based PCs. Single stepping, breakpoints, and register and memory display
are among the full set of features provided by NINDY.



The board is provided with the following 
features: 

• DRAM Subsystem operates at 
1-0-0-0-0-0-0-0 wait states for read and 
write cycles in the burst mode. The 
DRAM subsystem runs at the maximum 
processor frequency of 16 MHz, using 
100 ns fast page mode DRAMs. The 
DRAM subsystem can accommodate 
from 512 Kbytes to 4 Mbytes, using 4 or 8 
ZIP-packaged DRAMs. 

• Interleaved EPROM Subsystem executes
burst program fetches with a 2-0-1-0-2-0-1-0
wait state performance. The EPROM
subsystem accommodates four 32-pin or
28-pin 8-bit wide EPROMs with up to
150 ns access times.

• Flash EPROM Subsystem reads and
writes two 8-bit wide Flash EPROMs.

• 8259A Interrupt Controller provides 
expanded interrupt capabilities using 
the i960 SA/SB's interrupt controller 
interface. 

• Parallel Port Input allows fast 
downloads of code or data to the 
EV80960SX board. The parallel port 
provides auto-busy and interrupt 
capabilities, and is a full implementation 
of the Centronics standard. 



ACE51®, ICE® and MCS® are registered trademarks of Intel Corporation. 
Ethernet® is a registered trademark of Xerox Corporation. 
*CHMOS is a patented Intel process. 
• Two serial ports provide queued and
interrupt driven serial transfer at up to
128000 baud.

• 82C54 Timer/Counter provides a 32-bit
counter and 16-bit counter, each with
dedicated interrupts.

• Expansion/Prototype Bus (XBUS) allows
expansion cards and prototype hardware
direct access to the i960 SA/SB's bus and
control signals. Optionally, a configurable
wait state scheme provides a no glue
interface to most peripherals attached to the
XBUS.

• LEDs and Switches are user programmable.
One 10-segment bar LED, a 7-segment LED
and an 8-position switch are under program
control.

• Local Area Networking (LAN) is
implemented using an 82596SX LAN
coprocessor.

• Laser Printer Control provides interfaces to
TEC or Canon compatible laser engines.

• Monitor and Self-test diagnostics are
provided for the EV80960SX in the EPROMs
installed in the board.
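
The wait state profiles quoted in the feature list (1-0-0-0-0-0-0-0 for DRAM and 2-0-1-0-2-0-1-0 for the interleaved EPROM) can be compared with a little arithmetic. The sketch below is a simplified model, not a timing specification: it assumes each access in an eight-access burst costs one data clock plus its wait states, ignores address and recovery cycles, and assumes one bus clock is one period of the quoted 16 MHz processor frequency. The function names are hypothetical.

```python
# Wait-state profiles from the feature list: one entry per access
# in an eight-access burst.
DRAM_BURST = [1, 0, 0, 0, 0, 0, 0, 0]
EPROM_BURST = [2, 0, 1, 0, 2, 0, 1, 0]

# Maximum processor frequency quoted for the board. Assumption: one bus
# clock equals one period of this frequency.
CLOCK_MHZ = 16


def burst_data_clocks(wait_states):
    """Clocks in the data phase: one per access plus that access's wait states.

    Assumption: address and recovery cycles are not counted.
    """
    return len(wait_states) + sum(wait_states)


def burst_time_ns(wait_states, clock_mhz=CLOCK_MHZ):
    """Data-phase duration in nanoseconds at the given clock frequency."""
    return burst_data_clocks(wait_states) * 1000.0 / clock_mhz


if __name__ == "__main__":
    for name, profile in (("DRAM", DRAM_BURST), ("EPROM", EPROM_BURST)):
        print(name, burst_data_clocks(profile), "clocks,",
              burst_time_ns(profile), "ns")
```

In this simplified model the DRAM burst spends 9 clocks in its data phase and the EPROM burst 14, which is why the DRAM subsystem is the better target for performance-critical code and data.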

The evaluation board comes complete with a 
design database included on diskette, the 
NINDY debug monitor on diskette and in 
EPROM, power and serial cables, schematics 
and user's manual. 

The EV80960SX is a public domain design. The 
hardware is fully documented and provides 
working examples of popular memory and 
peripheral interfaces to the i960 SA/SB 
processor. The schematic and PLD database 
are provided with each board. The EV80960SX 
designs are easily duplicated and can be used 
directly as the building blocks for custom 
designs. Custom hardware can be prototyped 
using the expansion bus (XBUS) connector. 




Figure: EV80960SX Evaluation Board block diagram (82596SX LAN controller, DRAM, IC, Flash, interleaved EPROM, 82C54 timer/counter, buffers/expansion connector, LEDs and DIP switches, dual RS-232 serial with host port and user port, Centronics parallel port input).
NORTH AMERICAN SALES OFFICES



ALABAMA 

Intel Corp. 

5015 Bradford Dr., #2 * 
Huntsville 35805 
Tel: (205) 830-4010 
FAX: (205) 837-2640 

ARIZONA 

†Intel Corp.
410 North 44th Street 
Suite 500 
Phoenix 85008 
Tel: (602) 231-0386 
FAX: (602) 244-0446 

CALIFORNIA 

†Intel Corp.

21515 Vanowen Street 

Suite 116 

Canoga Park 91303 

Tel: (818) 704-8500 

FAX: (818)340-1144 

Intel Corp. 
1 Sierra Gate Plaza 
Suite 280C 
Roseville 95678 
Tel: (916) 782-8086 
FAX: (916) 782-8153 

†Intel Corp.

9665 Chesapeake Dr. 

Suite 325 

San Diego 92123 

Tel: (619) 292-8086 

FAX: (619) 292-0628 

*†Intel Corp.

400 N. Tustin Avenue 

Suite 450 

Santa Ana 92705 

Tel: (714) 835-9642 

TWX: 910-595-1114 

FAX: (714)541-9157 

*†Intel Corp.

San Tomas 4 

2700 San Tomas Expressway 

2nd Floor 

Santa Clara 95051 

Tel: (408) 986-8086 

TWX: 910-338-0255 

FAX: (408) 727-2620 

COLORADO 

Intel Corp. 

4445 Northpark Drive 

Suite 100 

Colorado Springs 80907 

Tel: (719) 594-6622 

FAX: (303)594-0720 

*†Intel Corp.
600 S. Cherry St. 
Suite 700 
Denver 80222 
Tel: (303)321-8086 
TWX: 910-931-2289 
FAX: (303) 322-8670 

CONNECTICUT 

†Intel Corp.

301 Lee Farm Corporate Park 

83 Wooster Heights Rd. 

Danbury 06810 

Tel: (203) 748-3130 

FAX: (203) 794-0339 

FLORIDA 

†Intel Corp.

800 Fairway Drive 

Suite 160 

Deerfield Beach 33441 

Tel: (305) 421-0506 

FAX: (305) 421-2444 



†Intel Corp.
5850 T.G. Lee Blvd. 
Suite 340 
Orlando 32822 
Tel: (407) 240-8000 
FAX: (407) 240-8097 

GEORGIA 

†Intel Corp.

20 Technology Parkway 

Suite 150 

Norcross 30092 

Tel: (404) 449-0541 

FAX: (404) 605-9762 

ILLINOIS 

*†Intel Corp.

Woodfield Corp. Center III 
300 N. Martingale Road 
Suite 400 

Schaumburg 60173 
Tel: (708) 605-8031 
FAX: (708) 706-9762 

INDIANA 

†Intel Corp.
8910 Purdue Road 
Suite 350 

Indianapolis 46268 
Tel: (317) 875-0623 
FAX: (317)875-8938 

MARYLAND 

*†Intel Corp.
10010 Junction Dr. 
Suite 200 

Annapolis Junction 20701 
Tel: (301) 206-2860 
FAX: (301)206-3677 
(301) 206-3678 

MASSACHUSETTS 

*†Intel Corp.
Westford Corp. Center
3 Carlisle Road 
2nd Floor 
Westford 01886 
Tel: (508) 692-0960 
TWX: 710-343-6333 
FAX: (508) 692-7867 

MICHIGAN 

†Intel Corp.

7071 Orchard Lake Road 

Suite 100 

West Bloomfield 48322 

Tel: (313) 851-8096 

FAX: (313) 851-8770 

MINNESOTA 

†Intel Corp.
3500 W. 80th St. 
Suite 360 

Bloomington 55431 
Tel: (612) 835-6722 
TWX: 910-576-2867 
FAX: (612) 831-6497 

NEW JERSEY 

*†Intel Corp.
Lincroft Office Center 
125 Half Mile Road 
Red Bank 07701 
Tel: (908) 747-2233 
FAX: (908) 747-0983 

NEW YORK 

*Intel Corp.

850 Crosskeys Office Park 

Fairport 14450 

Tel: (716) 425-2750 

TWX: 510-253-7391 

FAX: (716) 223-2561 



*†Intel Corp.

2950 Express Dr., South 

Suite 130 

Islandia 11722 

Tel: (516)231-3300 

TWX: 510-227-6236 

FAX: (516)348-7939 

†Intel Corp.

300 Westage Business Center 

Suite 230 

Fishkill 12524 

Tel: (914) 897-3860 

FAX: (914)897-3125 

OHIO 

*†Intel Corp.

3401 Park Center Drive 

Suite 220 

Dayton 45414 

Tel: (513) 890-5350 

TWX: 810-450-2528 

FAX: (513) 890-8658 

*†Intel Corp.
25700 Science Park Dr. 
Suite 100 
Beachwood 44122 
Tel: (216) 464-2736 
TWX: 810-427-9298 
FAX: (804) 282-0673 

OKLAHOMA 

Intel Corp. 
6801 N. Broadway 
Suite 115 

Oklahoma City 73162 
Tel: (405) 848-8086 
FAX: (405) 840-9819 

OREGON 

†Intel Corp.

15254 N.W. Greenbrier Pkwy. 

Building B 

Beaverton 97006 

Tel: (503) 645-8051 

TWX: 910-467-8741 

FAX: (503) 645-8181 

PENNSYLVANIA 

*†Intel Corp.
925 Harvest Drive 
Suite 200 
Blue Bell 19422 
Tel: (215) 641-1000 
FAX: (215)641-0785 

*†Intel Corp.
400 Penn Center Blvd. 
Suite 610 
Pittsburgh 15235 
Tel: (412) 823-4970 
FAX: (412) 829-7578 

PUERTO RICO 

†Intel Corp.
South Industrial Park 
P.O. Box 910 
Las Piedras 00671 
Tel: (809)733-8616 

TEXAS 

†Intel Corp.

8911 N. Capital of Texas Hwy.

Suite 4230 

Austin 78759 

Tel: (512)794-8086 

FAX: (512) 338-9335 

*†Intel Corp.
12000 Ford Road 
Suite 400 
Dallas 75234 
Tel: (214)241-8087 
FAX: (214) 484-1180



*†Intel Corp.
7322 S.W. Freeway
Suite 1490
Houston 77074 
Tel: (713) 988-8086 
TWX: 910-881-2490 
FAX: (713) 988-3660 

UTAH 

†Intel Corp.
428 East 6400 South 
Suite 104 
Murray 84107 
Tel: (801) 263-8051 
FAX: (801) 268-1457 

WASHINGTON 

†Intel Corp.

155 108th Avenue N.E. 

Suite 386 

Bellevue 98004 

Tel: (206) 453-8086 

TWX: 910-443-3002 

FAX: (206) 451-9556 

Intel Corp. 
408 N. Mullan Road 
Suite 102 
Spokane 99206 
Tel: (509) 928-8086 
FAX: (509) 928-9467 

WISCONSIN 

Intel Corp. 
330 S. Executive Dr. 
Suite 102 
Brookfield 53005 
Tel: (414) 784-8087 
FAX: (414)796-2115 



CANADA 



BRITISH COLUMBIA 

Intel Semiconductor of 
Canada, Ltd. 
4585 Canada Way 
Suite 202 
Burnaby V5G 4L6 
Tel: (604) 298-0387 
FAX: (604) 298-8234 



ONTARIO 

†Intel Semiconductor of

Canada, Ltd. 

2650 Queensview Drive 

Suite 250 

Ottawa K2B 8H6 

Tel: (613) 829-9714 

FAX: (613) 820-5936 

†Intel Semiconductor of
Canada, Ltd. 
190 Attwell Drive 
Suite 500 
Rexdale M9W 6H8 
Tel: (416)675-2105 
FAX: (416) 675-2438 



QUEBEC 

†Intel Semiconductor of
Canada, Ltd. 
1 Rue Holiday 
Suite 115 
Tour East 
Pt. Claire H9R 5N3 
Tel: (514)694-9130 
FAX: 514-694-0064 



†Sales and Service Office
*Field Application Location
NORTH AMERICAN DISTRIBUTORS



ALABAMA 

Arrow Electronics, Inc. 
1015 Henderson Road 
Huntsville 35806 
Tel: (205) 837-6955 
FAX: (205) 721-1581 

Hamilton/Avnet Electronics 
4960 Corporate Drive, #135 
Huntsville 35805 
Tel: (205) 837-7210 
FAX: (205) 721-0356 

MTI Systems Sales 
4950 Corporate Drive 
Suite 120 
Huntsville 35805 
Tel: (205) 830-9526 
FAX: (205) 830-9557 

Pioneer/Technologies Group, Inc. 
4835 University Square, #5 
Huntsville 35805 
Tel: (205) 837-9300 
FAX: (205) 837-9358 

ARIZONA 

†Arrow Electronics, Inc.
4134 E. Wood Street
Phoenix 85040 
Tel: (602) 437-0750 
FAX: (602) 252-9109 

Avnet Computer 

30 South McKemy Avenue 

Chandler 85226 

Tel: (602) 961-6460 

FAX: (602)961-4787 

Hamilton/Avnet Electronics 
30 South McKemy Avenue 
Chandler 85226 
Tel: (602) 961-6403 
FAX: (602) 961-1331 

Wyle Distribution Group 
4141 E. Raymond 
Phoenix 85040 
Tel: (602) 437-2088 
FAX: (602)437-2124 

CALIFORNIA 

Arrow Commercial System Group 
1502 Crocker Avenue 
Hayward 94544
Tel: (415)489-5371 
FAX: (415) 489-9393 

Arrow Commercial System Group 

14242 Chambers Road 

Tustin 92680 

Tel: (714) 544-0200 

FAX: (714) 731-8438 

†Arrow Electronics, Inc.
19748 Dearborn Street 
Chatsworth 91311 
Tel: (818) 701-7500 
FAX: (818) 772-8930 

†Arrow Electronics, Inc.
9511 Ridgehaven Court 
San Diego 92123 
Tel: (619) 565-4800 
FAX: (619) 279-8062 

†Arrow Electronics, Inc.
1180 Murphy Avenue 
San Jose 95131 
Tel: (408) 441-9700 
FAX: (408) 453-4810 

†Arrow Electronics, Inc.
2961 Dow Avenue 
Tustin 92680 
Tel: (714) 838-5422 
FAX: (714) 838-4151 

Avnet Computer 
3170 Pullman Street 
Costa Mesa 92626 
Tel: (714) 641-4121 
FAX: (714) 641-4170 

Avnet Computer 
1361B West 190th Street 
Gardena 90248 
Tel: (800) 345-3870 
FAX: (213) 327-5389 



Avnet Computer 
755 Sunrise Blvd., #150 
Roseville 95661 
Tel: (916) 781-2521 
FAX: (916) 781-3819 

Avnet Computer 
1175 Bordeaux Drive, #A 
Sunnyvale 94089 
Tel: (408) 743-3304 
FAX: (408) 743-3348 

Avnet Computer 
21150 Califa Street
Woodland Hills 91376 
Tel: (808) 345-3870 
FAX: (818)594-8333 

†Hamilton/Avnet Electronics
3170 Pullman Street 
Costa Mesa 92626 
Tel: (714)641-4100 
FAX: (714) 754-6033 

†Hamilton/Avnet Electronics
1175 Bordeaux Drive, #A
Sunnyvale 94089 
Tel: (408) 743-3300 
FAX: (408) 745-6679 

†Hamilton/Avnet Electronics
4545 Viewridge Avenue 
San Diego 92123 
Tel: (619) 571-1900 
FAX: (619) 571-8761 

†Hamilton/Avnet Electronics
21150 Califa St.
Woodland Hills 91367 
Tel: (818) 594-0403 
FAX: (818) 594-8234 

†Hamilton/Avnet Electronics
1361B West 190th Street
Gardena 90248 
Tel: (213) 516-8600 
FAX: (213) 217-6822 

†Hamilton/Avnet Electronics
755 Sunrise Avenue, #150 
Roseville 95661 
Tel: (916) 925-2216 
FAX: (916) 925-3478 

Pioneer/Technologies Group, Inc. 
134 Rio Robles 
San Jose 95134 
Tel: (408) 954-9100 
FAX: 408-954-9113 

†Wyle Distribution Group
124 Maryland Street 
El Segundo 90245 
Tel: (213) 322-8100 
FAX: (213)416-1151 

Wyle Distribution Group 
7431 Chapman Ave. 
Garden Grove 92641 
Tel: (714) 891-1717 
FAX: (714) 891-1621 

†Wyle Distribution Group
2951 Sunrise Blvd., Suite 175 
Rancho Cordova 95742 
Tel: (916) 638-5282 
FAX: (916)638-1491 

†Wyle Distribution Group
9525 Chesapeake Drive 
San Diego 92123 
Tel: (619) 565-9171 
FAX: (619)365-0512 

†Wyle Distribution Group
3000 Bowers Avenue 
Santa Clara 95051 
Tel: (408) 727-2500 
FAX: (408) 727-5896 

†Wyle Distribution Group
17872 Cowan Avenue 
Irvine 92714 
Tel: (714) 863-9953 
FAX: (714) 263-0473 

†Wyle Distribution Group
26010 Mureau Road, #150 
Calabasas 91302 
Tel: (818) 880-9000 
FAX: (818) 880-5510 



COLORADO 

Arrow Electronics, Inc. 
3254 C Frazer Street 
Aurora 80011
Tel: (303) 373-5616 
FAX: (303) 373-5760 

†Hamilton/Avnet Electronics
9605 Maroon Circle, #200 
Englewood 80112 
Tel: (303) 799-7800 
FAX: (303) 799-7801 

†Wyle Distribution Group
451 E. 124th Avenue 
Thornton 80241 
Tel: (303) 457-9953 
FAX: (303) 457-4831 

CONNECTICUT 

†Arrow Electronics, Inc.
12 Beaumont Road 
Wallingford 06492 
Tel: (203) 265-7741 
FAX: (203) 265-7988 

Avnet Computer 
55 Federal Road, #103 
Danbury 06810 
Tel: (203) 797-2880 
FAX: (203) 791-9050 

†Hamilton/Avnet Electronics
55 Federal Road, #103 
Danbury 06810 
Tel: (203) 743-6077 
FAX: (203) 791-9050 

†Pioneer/Standard Electronics
112 Main Street 
Norwalk 06851 
Tel: (203) 853-1515 
FAX: (203) 838-9901 

FLORIDA 

†Arrow Electronics, Inc.
400 Fairway Drive, #102 
Deerfield Beach 33441 
Tel: (305) 429-8200 
FAX: (305) 428-3991 

†Arrow Electronics, Inc.
37 Skyline Drive, #3101 
Lake Mary 32746 
Tel: (407) 333-9300 
FAX: (407) 333-9320 

Avnet Computer 
3343 W. Commercial Blvd. 
Bldg. C/D, Suite 107 
Ft. Lauderdale 33309 
Tel: (305) 979-9067 
FAX: (305) 730-0368 

Avnet Computer 
3247 Tech Drive North 
St. Petersburg 33716 
Tel: (813) 573-5524 
FAX: (813) 572-4324 

†Hamilton/Avnet Electronics
5371 N.W. 33rd Avenue 
Ft. Lauderdale 33309 
Tel: (305) 484-5016 
FAX: (305) 484-8369 

†Hamilton/Avnet Electronics
3247 Tech Drive North 
St. Petersburg 33716 
Tel: (813) 573-3930 
FAX: (813) 572-4329 

†Hamilton/Avnet Electronics
7079 University Boulevard 
Winter Park 32791 
Tel: (407) 657-3300 
FAX: (407) 678-1878 

†Pioneer/Technologies Group, Inc.
337 Northlake Blvd., Suite 1000
Altamonte Springs 32701
Tel: (407) 834-9090 
FAX: (407) 834-0865 



Pioneer/Technologies Group, Inc. 
674 S. Military Trail 
Deerfield Beach 33442 
Tel: (305) 428-8877 
FAX: (305)481-2950 

GEORGIA 

Arrow Commercial System Group 

3400 C. Corporate Way 

Duluth 30136 

Tel: (404) 623-8825 

FAX: (404) 623-8802 

†Arrow Electronics, Inc.

4250 E. Rivergreen Pkwy., #E 

Duluth 30136 

Tel: (404) 497-1300 

FAX: (404) 476-1493 

Avnet Computer 

3425 Corporate Way, #G 

Duluth 30136 

Tel: (404) 623-5452 

FAX: (404)476-0125 

Hamilton/Avnet Electronics 
3425 Corporate Way, #G 
Duluth 30136 
Tel: (404) 446-0611 
FAX: (404)446-1011 

Pioneer/Technologies Group, Inc. 

4250 C. Rivergreen Parkway 

Duluth 30136 

Tel: (404) 623-1003 

FAX: (404) 623-0665 



ILLINOIS 

†Arrow Electronics, Inc.
1140 W. Thorndale Rd. 
Itasca 60143 
Tel: (708) 250-0500 

Avnet Computer 
1124 Thorndale Avenue
Bensenville 60106 
Tel: (708) 860-8573 
FAX: (708) 773-7976 

†Hamilton/Avnet Electronics
1130 Thorndale Avenue
Bensenville 60106 
Tel: (708) 860-7700 
FAX: (708) 860-8530 

MTI Systems 

1140 W. Thorndale Avenue 

Itasca 60143 

Tel: (708) 250-8222 

FAX: (708) 250-8275 

†Pioneer/Standard Electronics
2171 Executive Dr., Suite 200 
Addison 60101 
Tel: (708) 495-9680 
FAX: (708) 495-9831 



INDIANA 

†Arrow Electronics, Inc.

7108 Lakeview Parkway West Dr. 

Indianapolis 46268 

Tel: (317) 299-2071 

FAX: (317) 299-2379 

Avnet Computer 
485 Gradle Drive 
Carmel 46032 
Tel: (317) 575-8029 
FAX: (317) 844-4964 

Hamilton/Avnet Electronics 
485 Gradle Drive 
Carmel 46032 
Tel: (317) 844-9333 
FAX: (317) 844-5921 

†Pioneer/Standard Electronics
9350 Priority Way West Dr. 
Indianapolis 46250 
Tel: (317) 573-0880 
FAX: (317)573-0979 
NORTH AMERICAN DISTRIBUTORS (Contd.)



IOWA 

Hamilton/Avnet Electronics 
2335A Blairsferry Rd., N.E. 
Cedar Rapids 52402 
Tel: (319) 362-4757 
FAX: (319)393-7050 

KANSAS 

Arrow Electronics, Inc. 
8208 Melrose Dr., Suite 210 
Lenexa 66214 
Tel: (913) 541-9542 
FAX: (913) 541-0328 

Avnet Computer 
15313 W. 95th Street 
Lenexa 61219 
Tel: (913)541-7989 
FAX: (913)541-7904 

†Hamilton/Avnet Electronics
15313 W. 95th 
Overland Park 66215 
Tel: (913) 888-1055 
FAX: (913) 541-7951 

KENTUCKY 

Hamilton/Avnet Electronics 
805 A. Newtown Circle 
Lexington 40511
Tel: (606) 259-1475 
FAX: (606) 252-3238 

MARYLAND 

Arrow Commercial Systems Group 
200 Perry Parkway 
Gaithersburg 20877 
Tel: (301)670-1600 
FAX: (301)670-0188 

†Arrow Electronics, Inc.
8300 Guilford Road, #H
Columbia 21046
Tel: (301)995-6002 
FAX: (301)995-6201 

Avnet Computer 

7172 Columbia Gateway Dr., #G 

Columbia 21045 

Tel: (301) 995-0020 

FAX: (301)995-3515 

†Hamilton/Avnet Electronics
7172 Columbia Gateway Dr., #F 
Columbia 21045 
Tel: (301) 995-3554 
FAX: (301) 995-3515 

†North Atlantic Industries
Systems Division 
7125 Riverwood Dr. 
Columbia 21046 
Tel: (301) 290-3999 

†Pioneer/Technologies Group, Inc.
15810 Gaither Road 
Gaithersburg 20877 
Tel: (301)921-0660 
FAX: (301) 670-6746 

MASSACHUSETTS 

Arrow Electronics, Inc. 
25 Upton Dr. 
Wilmington 01887 
Tel: (508) 658-0900 
FAX: (508)694-1754 

Avnet Computer 
10D Centennial Drive
Peabody 01960 
Tel: (508) 532-9886 
FAX: (508) 532-9660 

†Hamilton/Avnet Electronics
10D Centennial Drive 
Peabody 01960 
Tel: (508) 531-7430 
FAX: (508) 532-9802 

†Pioneer/Standard Electronics
44 Hartwell Avenue 
Lexington 02173 
Tel: (617) 861-9200 
FAX: (617)863-1547 

Wyle Distribution Group 
15 Third Avenue 
Burlington 01803 
Tel: (617) 272-7300 
FAX: (617) 272-6809 



MICHIGAN 

†Arrow Electronics, Inc.
19880 Haggerty Road 
Livonia 48152 
Tel: (313) 665-4100 
FAX: (313)462-2686 

Avnet Computer 

2876 28th Street, S.W., #5 

Grandville 49418 

Tel: (616) 531-9607 

FAX: (616) 531-0059 

Avnet Computer 
41650 Garden Road 
Novi 48375 
Tel: (313) 347-1820 
FAX: (313) 347-4067 

Hamilton/Avnet Electronics 
2876 28th Street, S.W., #5 
Grandville 49418 
Tel: (616) 243-8805 
FAX: (616)531-0059 

Hamilton/Avnet Electronics 

41650 Garden Brook Rd., #100 

Novi 48375 

Tel: (313) 347-4270 

FAX: (313) 347-4021 

†Pioneer/Standard Electronics
4505 Broadmoor S.E. 
Grand Rapids 49512 
Tel: (616) 698-1800 
FAX: (616)698-1831 

†Pioneer/Standard Electronics
13485 Stamford 
Livonia 48150 
Tel: (313) 525-1800 
FAX: (313) 427-3720 

MINNESOTA 

†Arrow Electronics, Inc.
10120A West 76th Street 
Eden Prairie 55344 
Tel: (612) 829-5588 
FAX: (612)942-7803 

Avnet Computer 
10000 West 76th Street 
Eden Prairie 55344 
Tel: (612) 829-0025 
FAX: (612) 944-2781 

†Hamilton/Avnet Electronics
12400 Whitewater Drive 
Minnetonka 55343 
Tel: (612)932-0600 
FAX: (612) 932-0613

†Pioneer/Standard Electronics
7625 Golden Triangle Dr., #G
Eden Prairie 55344 
Tel: (612) 944-3355 
FAX: (612) 944-3794 

MISSOURI 

†Arrow Electronics, Inc.
2380 Schuetz Road 
St. Louis 63141 
Tel: (314) 567-6888 
FAX: (314)567-1164 

Avnet Computer 
739 Goddard Avenue 
Chesterfield 63005 
Tel: (314) 537-2725 
FAX: (314) 537-4248 

†Hamilton/Avnet Electronics
741 Goddard 
Chesterfield 63005 
Tel: (314) 537-1600 
FAX: (314)537-4248 

NEW HAMPSHIRE 

Avnet Computer 
2 Executive Park Drive 
Bedford 03102 
Tel: (603) 624-6630 
FAX: (603) 624-2402 

NEW JERSEY 

†Arrow Electronics, Inc.
4 East Stow Road 
Unit 11 

Marlton 08053 
Tel: (609) 596-8000 
FAX: (609) 596-9632 



†Arrow Electronics, Inc.
6 Century Drive
Parsippany 07054
Tel: (201) 538-0900 
FAX: (201)538-4962 

Avnet Computer 

1-B Keystone Ave., Bldg. 36 

Cherry Hill 08003 

Tel: (609) 424-8961 

FAX: (609)751-2502 

Avnet Computer 
10 Industrial Road 
Fairfield 07006 
Tel: (201) 882-2879 
FAX: (201) 808-9251 

†Hamilton/Avnet Electronics
1 Keystone Ave., Bldg. 36 
Cherry Hill 08003 
Tel: (609)424-0110 
FAX: (609)751-2552 

†Hamilton/Avnet Electronics
10 Industrial 
Fairfield 07006 
Tel: (201) 575-3390 
FAX: (201) 575-5839 

†MTI Systems Sales
6 Century Drive 
Parsippany 07054 
Tel: (201) 539-6496 
FAX: (201) 539-6430 

†Pioneer/Standard Electronics
14-A Madison Rd. 
Fairfield 07006 
Tel: (201)575-3510 
FAX: (201) 575-3454 

NEW MEXICO 

Alliance Electronics Inc. 
10510 Research Avenue 
Albuquerque 87123 
Tel: (505) 292-3360 
FAX: (505) 275-6392 

Avnet Computer 
7801 Academy Road 
Bldg. 1, Suite 204 
Albuquerque 87109 
Tel: (505) 828-9725 
FAX: (505) 828-0360 

†Hamilton/Avnet Electronics
7801 Academy Rd. N.E. 
Bldg. 1, Suite 204 
Albuquerque 87108 
Tel: (505) 765-1500 
FAX: (505) 243-1395 

NEW YORK 

†Arrow Electronics, Inc.

3375 Brighton Henrietta Townline Rd. 

Rochester 14623 

Tel: (716) 427-0300 

FAX: (716) 427-0735 

Arrow Electronics, Inc. 
20 Oser Avenue 
Hauppauge 11788 
Tel: (516) 231-1000 
FAX: (516)231-1072 

Avnet Computer 
933 Motor Parkway 
Hauppauge 11788 
Tel: (516) 231-9040 
FAX: (516)434-7426 

Avnet Computer 
2060 Townline 
Rochester 14623

Tel: (716)272-9306 
FAX: (716)272-9685 

†Hamilton/Avnet Electronics
933 Motor Parkway 
Hauppauge 11788 
Tel: (516) 231-9800 
FAX: (516) 434-7426

†Hamilton/Avnet Electronics
2060 Townline Rd. 
Rochester 14623 
Tel: (716) 292-0730 
FAX: (716) 292-0810 



Hamilton/Avnet Electronics 
103 Twin Oaks Drive 
Syracuse 13120 
Tel: (315) 437-2641 
FAX: (315) 432-0740 

MTI Systems 
50 Horseblock Road 
Brookhaven 11719 
Tel: (516) 924-9400 
FAX: (516)924-1103 

MTI Systems 
1 Penn Plaza 
250 W. 34th Street 
New York 10119 
Tel: (212) 643-1280 
FAX: (212) 643-1288 

Pioneer/Standard Electronics 
68 Corporate Drive 
Binghamton 13904 
Tel: (607) 722-9300 
FAX: (607) 722-9562 

†Pioneer/Standard Electronics
60 Crossway Park West 
Woodbury, Long Island 11797 
Tel: (516) 921-8700 
FAX: (516)921-2143 

†Pioneer/Standard Electronics
840 Fairport Park 
Fairport 14450 
Tel: (716) 381-7070 
FAX: (716)381-5955 

NORTH CAROLINA 

†Arrow Electronics, Inc.
5240 Greensdairy Road 
Raleigh 27604 
Tel: (919) 876-3132 
FAX: (919) 878-9517 

Avnet Computer 
2725 Millbrook Rd., #123 
Raleigh 27604 
Tel: (919) 790-1735 
FAX: (919) 872-4972 

Hamilton/Avnet Electronics 
5250-77 Center Dr. #350 
Charlotte 28217 
Tel: (704) 527-2485 
FAX: (704) 527-8058 

†Hamilton/Avnet Electronics
3510 Spring Forest Drive 
Raleigh 27604 
Tel: (919) 878-0819 

Pioneer/Technologies Group, Inc. 
9401 L-Southern Pine Blvd. 
Charlotte 28210 
Tel: (704) 527-8188 
FAX: (704) 522-8564 

Pioneer/Technologies Group, Inc.
2810 Meridian Parkway, #148 
Durham 27713 
Tel: (919) 544-5400 
FAX: (919) 544-5885 

OHIO 

Arrow Commercial System Group 

284 Cramer Creek Court 

Dublin 43017 

Tel: (614) 889-9347 

FAX: (614)889-9680 

†Arrow Electronics, Inc.
6573 Cochran Road, #E 
Solon 44139 
Tel: (216) 248-3990 
FAX: (216) 248-1106 

Arrow Electronics, Inc. 
8200 Washington Village Dr. 
Centerville 45458 
Tel: (513) 435-5563 
FAX: (513) 435-2049
NORTH AMERICAN DISTRIBUTORS (Contd.)



OHIO (Contd.) 

Avnet Computer 

7764 Washington Village Dr. 

Dayton 45459 

Tel: (513) 439-6756 

FAX: (513)439-6719 

Avnet Computer 

30325 Bainbridge Rd., Bldg. A 

Solon 44139 

Tel: (216) 349-2505 

FAX: (216)349-1894 

†Hamilton/Avnet Electronics
7760 Washington Village Dr. 
Dayton 45459 
Tel: (513) 439-6733 
FAX: (513)439-6711 

†Hamilton/Avnet Electronics
30325 Bainbridge 
Solon 44139 
Tel: (800) 543-2984 
FAX: (216) 349-1894 

Hamilton/Avnet Electronics 
2600 Corp Exchange Drive, #18' 
Columbus 43231 
Tel: (614) 882-7004 
FAX: (614) 882-8650 

MTI Systems Sales 

23404 Commerce Park Road 

Beachwood 44122 

Tel: (216) 464-6688 

FAX: (216) 464-3564 

†Pioneer/Standard Electronics
4433 Interpoint Boulevard 
Dayton 45424 
Tel: (513) 236-9900 
FAX: (513) 236-8133 

†Pioneer/Standard Electronics
4800 E. 131st Street 
Cleveland 44105 
Tel: (216) 587-3600 
FAX: (216) 663-1004 

OKLAHOMA 

Arrow Electronics, Inc. 

12111 East 51st Street, #101 

Tulsa 74146 

Tel: (918) 252-7537 

FAX: (918) 254-0917 

†Hamilton/Avnet Electronics

12121 E. 51st St., Suite 102A 

Tulsa 74146 

Tel: (918) 664-0444 

FAX: (918) 250-8763 

OREGON 

†Almac Electronics Corp.
1885 N.W. 169th Place 
Beaverton 97006 
Tel: (503) 629-8090 
FAX: (503) 645-0611

Avnet Computer 

9409 Southwest Nimbus Ave. 

Beaverton 97005 

Tel: (503) 627-0900 

FAX: (503) 526-6242 

†Hamilton/Avnet Electronics
9409 S.W. Nimbus Ave. 
Beaverton 97005 
Tel: (503) 627-0201 
FAX: (503) 641-4012 

Wyle 

9640 Sunshine Court 
Bldg. G, Suite 200 
Beaverton 97005 
Tel: (503) 643-7900 
FAX: (503) 646-5466 

PENNSYLVANIA 

Avnet Computer 

213 Executive Drive, #320 

Mars 16046 

Tel: (412) 772-1888 

FAX: (412)772-1890 

Hamilton/Avnet Electronics 
213 Executive, #320 
Mars 16045 
Tel: (412) 281-4152 
FAX: (412) 772-1890 



Pioneer/Technologies Group, Inc. 
259 Kappa Drive 
Pittsburgh 15238 
Tel: (412) 782-2300 
FAX: (412)963-8255 

†Pioneer/Technologies Group, Inc.

500 Enterprise Road 

Keith Valley Business Center 

Horsham 19044 

Tel: (215) 674-4000 

FAX: (215) 674-3107 

TENNESSEE 

Arrow Commercial System Group 
3635 Knight Road, #7 
Memphis 38118 
Tel: (901) 367-0540 
FAX: (901) 367-2081

TEXAS 

Arrow Electronics, Inc. 
3220 Commander Drive 
Carrollton 75006 
Tel: (214) 380-6464 
FAX: (214) 248-7208 

Avnet Computer 

4004 Beltline, Suite 200 

Dallas 75244 

Tel: (214) 308-8181 

FAX: (214) 308-8129 

Avnet Computer 

1235 North Loop West, #525 

Houston 77008 

Tel: (713) 867-7500 

FAX: (713)861-6851 

†Hamilton/Avnet Electronics
1826-F Kramer Lane 
Austin 78758 
Tel: (800) 772-5668 
FAX: (512) 832-4315 

†Hamilton/Avnet Electronics
4004 Beltline, #200 
Dallas 75244 
Tel: (214) 308-8111 
FAX: (214)308-8109 

†Hamilton/Avnet Electronics
1235 N. Loop West, #521 
Houston 77008 
Tel: (713) 240-7733 
FAX: (713) 861-6541 

†Pioneer/Standard Electronics

1826-D Kramer Lane 

Austin 78758 

Tel: (512) 835-4000 

FAX: (512) 835-9829 

†Pioneer/Standard Electronics

13765 Beta Road 

Dallas 75244 

Tel: (214) 386-7300 

FAX: (214)490-6419 

†Pioneer/Standard Electronics
10530 Rockley Road, #100 
Houston 77099 
Tel: (713) 495-4700 
FAX: (713)495-5642 

†Wyle Distribution Group
1810 Greenville Avenue 
Richardson 75081 
Tel: (214) 235-9953 
FAX: (214) 644-5064 

Wyle Distribution Group 
4030 West Braker Lane, #330 
Austin 78758
Tel: (512) 345-8853 
FAX: (512)345-9330 

Wyle Distribution Group 
11001 South Wilcrest, #100 
Houston 77099 
Tel: (713) 879-9953 
FAX: (713) 879-6540 

UTAH 

Arrow Electronics, Inc. 
1946 W. Parkway Blvd.
Salt Lake City 84119 
Tel: (801) 973-6913 



Avnet Computer 
1100 E. 6600 South, #150 
Salt Lake City 84121 
Tel: (801) 266-1115 
FAX: (801) 266-0362 

Avnet Computer 

17761 Northeast 78th Place 

Redmond 98052 

Tel: (206) 867-0160 

FAX: (206) 867-0161 

†Hamilton/Avnet Electronics
1100 East 6600 South, #120 
Salt Lake City 84121 
Tel: (801)972-2800 
FAX: (801) 263-0104 

†Wyle Distribution Group
1325 West 2200 South, #E
West Valley 84119
Tel: (801) 974-9953 
FAX: (801) 972-2524 

WASHINGTON 

†Almac Electronics Corp.
14360 S.E. Eastgate Way
Bellevue 98007 
Tel: (206) 643-9992 
FAX: (206) 643-9709 

†Hamilton/Avnet Electronics
17761 N.E. 78th Place, #C 
Redmond 98052 
Tel: (206) 241-8555 
FAX: (206) 241-5472 

Wyle Distribution Group 
15385 N.E. 90th Street 
Redmond 98052 
Tel: (206)881-1150 
FAX: (206) 881-1567 

WISCONSIN 

Arrow Electronics, Inc. 

200 N. Patrick Blvd., Ste. 100 

Brookfield 53005 

Tel: (414) 792-0150 

FAX: (414) 792-0156 

Avnet Computer 

20875 Crossroads Circle, #400 

Waukesha 53186 

Tel: (414) 784-8205 

FAX: (414) 784-6006 

†Hamilton/Avnet Electronics
28875 Crossroads Circle, #400 
Waukesha 53186 
Tel: (414) 784-4510 
FAX: (414) 784-9509 

Pioneer/Standard Electronics 
120 Bishops Way #163 
Brookfield 53005 
Tel: (414) 784-3480 

ALASKA 

Avnet Computer 
1400 West Benson Blvd. 
Suite 400 
Anchorage 99503 
Tel: (907) 274-9899 
FAX: (907) 277-2639 



CANADA 



ALBERTA 

Avnet Computer 

2816 21st Street Northeast 

Calgary T2E 6Z2 

Tel: (403) 291-3284 

FAX: (403) 250-1591 

Zentronics 

6815 8th Street N.E., #100 

Calgary T2E 7H 

Tel: (403) 295-8838 

FAX: (403) 295-8714 

BRITISH COLUMBIA 

†Hamilton/Avnet Electronics
8610 Commerce Court 
Burnaby V5A 4N6 
Tel: (604) 420-4101 
FAX: (604) 420-5376 



Zentronics 

11400 Bridgeport Rd., #108 

Richmond V6X 1T2 

Tel: (604) 273-5575 

FAX: (604) 273-2413 

ONTARIO 

Arrow Electronics, Inc. 
36 Antares Dr., Unit 100 
Nepean K2E 7W5 
Tel: (613) 226-6903 
FAX: (613) 723-2018 

†Arrow Electronics, Inc.
1093 Meyerside, Unit 2 
Mississauga L5T 1M4 
Tel: (416) 670-7769 
FAX: (416) 670-7781 

Avnet Computer 

Canada System Engineering 

Group 

3688 Nashua Dr., Unit 6 

Mississauga L4V 1M5

Tel: (416) 672-8638 

FAX: (416) 677-5091 

Avnet Computer 
6845 Rexwood Road 
Units 7-9 

Mississauga L4V 1M4
Tel: (416)672-8638 
FAX: (416) 672-8650 

Avnet Computer 
190 Colonade Road 
Nepean K2E 7J5 
Tel: (613) 727-7529 
FAX: (613) 226-1184 

†Hamilton/Avnet Electronics
6845 Rexwood Rd., Units 3-5
Mississauga L4T 1R2 
Tel: (416) 677-7432 
FAX: (416) 677-0940 

†Hamilton/Avnet Electronics
190 Colonade Road 
Nepean K2E 7J5 
Tel: (613) 226-1700 
FAX: (613) 226-1184 

†Zentronics
1355 Meyerside Drive
Mississauga L5T 1C9
Tel: (416) 564-9600
FAX: (416) 564-3127

†Zentronics

155 Colonade Rd., South 

Unit 17 

Nepean K2E 7K1 

Tel: (613) 226-8840 

FAX: (613) 226-6352 

QUEBEC 

Arrow Electronics Inc. 
1100 St. Regis Blvd. 
Dorval H9P 2T5
Tel: (514) 421-7411
FAX: (514) 421-7430 

Arrow Electronics, Inc. 

500 Boul. St-Jean-Baptiste Ave. 

Quebec H2E 5R9 

Tel: (418) 871-7500 

FAX: (418) 871-6816 

Avnet Computer 
2795 Rue Halpern 
St. Laurent H4S 1P8 
Tel: (514) 335-2483 
FAX: (514) 335-2481 

†Hamilton/Avnet Electronics
2795 Halpern 
St. Laurent H4S 1P8 
Tel: (514) 335-1000 
FAX: (514) 335-2481 

†Zentronics
520 McCaffrey
St. Laurent H4T 1N3
Tel: (514) 737-9700 
FAX: (514) 737-5212 
EUROPEAN SALES OFFICES



FINLAND

Intel Finland OY
Ruosilantie 2
00390 Helsinki
Tel: (358) 544 644
FAX: (358) 544 030 

FRANCE 

Intel Corporation S.A.R.L. 

1, Rue Edison-BP 303

78054 St. Quentin-en-Yvelines 

Cedex 

Tel: (33) (1)30 57 70 00 

FAX: (33) (1)30 64 60 32 



GERMANY 

Intel GmbH 

Dornacher Strasse 1 

8016 Feldkirchen bei Muenchen 

Tel: (49) 089/90992-0 

FAX: (49) 089/9043948 

ISRAEL 

Intel Semiconductor Ltd. 

Atidim Industrial Park-Neve Sharet 

P.O. Box 43202 

Tel-Aviv 61430 

Tel: (972) 03 498080 

FAX: (972)03 491870 



ITALY 

Intel Corporation Italia S.p.A. 

Milanofiori Palazzo E 

20094 Assago 

Milano 

Tel: (39) (02) 89200950 

FAX: (39) (2) 3498464 

NETHERLANDS 

Intel Semiconductor B.V. 
Postbus 84130 
3009 CC Rotterdam 
Tel: (31) 10 40711 11 
FAX: (31) 10 455 4688 



SPAIN 

Intel Iberia S.A. 
Zurbaran, 28
28010 Madrid 
Tel: (34) 308 25 52 
FAX: (34)410 7570 

SWEDEN 

Intel Sweden A.B. 
Dalvagen 24 
171 36 Solna
Tel: (46) 8 734 01 00 
FAX: (46)8 278085 



UNITED KINGDOM 

Intel Corporation (U.K.) Ltd. 
Pipers Way 

Swindon, Wiltshire SN3 1RJ 
Tel: (44) (0793) 696000 
FAX: (44) (0793)641440 



EUROPEAN DISTRIBUTORS/REPRESENTATIVES 



AUSTRIA

Bacher Electronics GmbH 

Rotenmuehlgasse 26 

A-1120 Wien

Tel: 43 222 81356460 

FAX: 43 222 834276 

BELGIUM 

Inelco Belgium S.A. 
Oorlogskruisenlaan 94 
B-1120 Bruxelles 
Tel: 32 2 244 2811 
FAX: 32 2 216 4301 

FRANCE 

Almex 

48, Rue de l'Aubepine

B.P. 102 

92164 Antony Cedex 

Tel: 33 1 4096 5400 

FAX: 33 1 4666 6028 

Lex Electronics 
Silic 585 

60 Rue des Gemeaux 
94663 Rungis Cedex 
Tel: 33 1 4978 4978 
FAX: 33 1 4978 0596 

Metrologie 
Tour d'Asnieres 
4, Avenue Laurent Cely 
92606 Asnieres Cedex 
Tel: 33 1 4790 6240 
FAX: 33 1 4790 5947 

Tekelec-Airtronic 

Cite Des Bruyeres 

Rue Carle Vernet 

BP2 

92310 Sevres 

Tel: 33 1 4623 2425 

FAX: 33 1 4507 2191 

GERMANY 

E2000 Vertriebs-AG 
Stahlgruberring 12
8000 Muenchen 82 
Tel: 49 89 420010 
FAX: 49 89 4200 1209 

Jermyn GmbH 
Im Dachsstueck 9 
6250 Limburg 
Tel: 49 6431 5080 
FAX: 49 6431 508289 

Metrologie GmbH 
Steinerstrasse 15 
8000 Muenchen 70 
Tel: 49 89 724470 
FAX: 49 89 72447111



Proelectron Vertriebs GmbH 
Max-Planck-Strasse 1-3 
6072 Dreieich 
Tel: 49 6103 304343 
FAX: 49 6103 304425 

Rein Electronik GmbH 
Loetscher Weg 66 
4054 Nettetal 1 
Tel: 49 2153 7330 
FAX: 49 2153 733513 



GREECE 

Pouliadis Associates Corp. 
5 Koumbari Street 
Kolonaki Square 
10674 Athens 
Tel: 30 1 360 3741 
FAX: 30 1 360 7501 



IRELAND 

Micro Marketing 

Tany Hall 

Eglinton Terrace 

Dundrum 

Dublin 

Tel: 0001 989 400 

FAX: 0001 989 8282 



ISRAEL 

Eastronics Ltd. 
Rozanis 11 
P.O.B. 39300 
Tel Baruch 
Tel-Aviv 61392 
Tel: 972 3 475151 
FAX: 972 3 475125 



ITALY 

Celdis Spa 

Via F.lli Gracchi 36

20092 Cinisello Balsamo 

Milano 

Tel: 39 2 66012003 

FAX: 39 2 6182433 

Intesi Div. Della Deutsche
Divisione ITT 
Industries GmbH 
P.I. 06550110156 
Milanofiori Palazzo E5 
20094 Assago (Milano) 
Tel: 39 2 824701 
FAX: 39 2 8242631 



Lasi Elettronica S.p.A. 
P.I. 00839000155 
Viale Fulvio Testi, N.280 
20126 Milano 
Tel: 39 2 66101370 
FAX: 39 2 66101385 

Telcom s.r.l. — Divisione MDS 

Via Trombetta 

Zona Marconi 

Strada Cassanese

Segrate-Milano

Tel: 39 2 2138010 

FAX: 39 2 216061 

NETHERLANDS 

Koning en Hartman B.V. 
Energieweg 1 
2627 AP Delft 
Tel: 31 15 609 906 
FAX: 31 15 619 194 



PORTUGAL 

ATD Electronica LDA 
Rua Dr. Faria de 
Vasconcelos, 3a 
1900 Lisboa 
Tel: 351 1 8472200 
FAX: 351 1 8472197 



SPAIN 

ATD Electronica 

Plaza Ciudad de Viena, 6 

28040 Madrid 

Tel: 34 1 534 4000/09 

FAX: 34 1 534 7663 

Metrologia Iberica 

Ctra De Fuencarral N.80 

28100 Alcobendas 

Madrid 

Tel: 34 1 6538611 

FAX: 34 1 6517549 



SCANDINAVIA 

OY Fintronic AB 
Heikkilantie 2a 
SF-00210 Helsinki 
Tel: 358 6926022 
FAX: 358 6821251 



ITT Multikomponent A/S 

Naverland 29 

DK-2600 Glostrup 

Denmark 

Tel: 010 45 42 451822 

FAX: 010 45 42 457624 

Nordisk Elektronik A/S 

Postboks 122

Smedsvingen 4 

N-1364 Hvalstad

Norway 

Tel: 47 2 846210 

FAX: 47 2 846545 

Nordisk Electronik AB 

Box 36 

Torshamnsgatan 39 

S-16493 Kista

Sweden 

Tel: 46 8 7034630 

FAX: 46 8 7039845 



SWITZERLAND 

Industrade A.G. 
Hertistrasse 31 
CH-8304 Wallisellen 
Tel: 41 1 8328111 
FAX: 41 1 8307550 



TURKEY 

EMPA 

80050 Sishane 

Refik Saydam Cad No. 89/5 

Istanbul 

Tel: 90 1 143 6212 

FAX: 90 1 143 6547 



UNITED KINGDOM 

Access Elect Comp Ltd. 

Jubilee House 

Jubilee Road 

Letchworth 

Hertfordshire 

SG6 1QH 

Tel: 0462 480888 

FAX: 0462 682467 

Bytech Components Ltd. 
12a Cedarwood
Chineham Business Park 
Crockford Lane 
Basingstoke 
Hants RG12 1RW
Tel: 0256 707107 
FAX: 0256 707162 



Bytech Systems 
Unit 3

The Western Centre 
Western Road 
Bracknell 
Berks RG12 1RW
Tel: 0344 55333 
FAX: 0344 867270 

Metrologie 
Rapid House 
Oxford Road 
High Wycombe 
Bucks 

Herts HP11 2EE
Tel: 0494 474147 
FAX: 0494 452144 

Jermyn 
Vestry Estate 
Otford Road 
Sevenoaks 
Kent TN14 5EU
Tel: 0732 450144
FAX: 0732 451251 

MMD 

3 Bennet Court 

Bennet Road

Reading 

Berkshire RG2 0QX 

Tel: 0734 313232 

FAX: 0734 313255 

Rapid Silicon 
3 Bennet Court 
Bennet Road 
Reading 
Berks RG2 0QX 
Tel: 0734 752266 
FAX: 0734 312728 

Metro Systems 
Rapid House 
Oxford Road 
High Wycombe 
Bucks HP11 2EE 
Tel: 0494 474171 
FAX: 0494 21860 



YUGOSLAVIA 

H.R. Microelectronics Corp. 

2005 de la Cruz Blvd. 

Suite 220 

Santa Clara, CA 95050 

U.S.A. 

Tel: (408) 988-0286 

FAX: (408) 988-0306 
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INTERNATIONAL SALES OFFICES 



AUSTRALIA 

Intel Australia Pty. Ltd. 

Unit 13 

Allambie Grove Business Park 

25 Frenchs Forest Road East 

Frenchs Forest, NSW, 2086 

Sydney 

Tel: 61-2-975-3300 

FAX: 61-2-975-3375 

Intel Australia Pty. Ltd. 

711 High Street 

1st Floor 

East Kw. Vic, 3102 

Melbourne 

Tel: 61-3-810-2141 

FAX: 61-3-819 7200 

BRAZIL 

Intel Semiconductores do Brazil LTDA 
Avenida Paulista, 1159-CJS 404/405 
01311 - Sao Paulo - S.P.
Tel: 55-11-287-5899 
TLX: 11-37-557-ISDB 
FAX: 55-11-287-5119 

CHINA/HONG KONG 

Intel PRC Corporation 
15/F, Office 1, Citic Bldg.
Jian Guo Men Wai Street 
Beijing, PRC 
Tel: (1) 500-4850 
TLX: 22947 INTEL CN 
FAX: (1) 500-2953 



Intel Semiconductor Ltd.* 
10/F East Tower
Bond Center 
Queensway, Central 
Hong Kong 
Tel: (852) 844-4555 
FAX: (852) 868-1989 



INDIA 

Intel Asia Electronics, Inc. 

4/2, Samrah Plaza 

St. Mark's Road 

Bangalore 560001 

Tel: 91-812-215773 

TLX: 953-845-2646 INTEL IN 

FAX: 091-812-215067 



JAPAN 

Intel Japan K.K. 

5-6 Tokodai, Tsukuba-shi 

Ibaraki, 300-26 

Tel: 0298-47-8511 

FAX: 0298-47-8450 

Intel Japan K.K.* 
Hachioji ON Bldg. 
4-7-14 Myojin-machi 
Hachioji-shi, Tokyo 192 
Tel: 0426-48-8770 
FAX: 0426-48-8775 



Intel Japan K.K.* 
Bldg. Kumagaya 
2-69 Hon-cho 

Kumagaya-shi, Saitama 360 
Tel: 0485-24-6871 
FAX: 0485-24-7518 

Intel Japan K.K.* 
Kawa-asa Bldg. 
2-11-5 Shin-Yokohama 
Kohoku-ku, Yokohama-shi 
Kanagawa, 222 
Tel: 045-474-7661 
FAX: 045-471-4394 

Intel Japan K.K.* 
Ryokuchi-Eki Bldg. 
2-4-1 Terauchi 
Toyonaka-shi, Osaka 560 
Tel: 06-863-1091 
FAX: 06-863-1084 

Intel Japan K.K. 
Shinmaru Bldg. 
1-5-1 Marunouchi 
Chiyoda-ku, Tokyo 100 
Tel: 03-3201-3621 
FAX: 03-3201-6850 

Intel Japan K.K. 
Green Bldg. 
1-16-20 Nishiki 
Naka-ku, Nagoya-shi 
Aichi 460 
Tel: 052-204-1261 
FAX: 052-204-1285 



KOREA 

Intel Korea, Ltd. 

16th Floor, Life Bldg. 

61 Yoido-dong, Youngdeungpo-Ku 

Seoul 150-010 

Tel: (2) 784-8186

FAX: (2) 784-8096 



SINGAPORE 

Intel Singapore Technology, Ltd. 
101 Thomson Road #08-03/06 
United Square 
Singapore 1130 
Tel: (65) 250-7811 
FAX: (65) 250-9256 



TAIWAN 

Intel Technology Far East Ltd. 

Taiwan Branch Office 

8th Floor, No. 205 

Bank Tower Bldg. 

Tung Hua N. Road 

Taipei 

Tel: 886-2-5144202 

FAX: 886-2-717-2455 



INTERNATIONAL DISTRIBUTORS/REPRESENTATIVES



ARGENTINA 

Dafsys S.R.L. 
Chacabuco, 90-6 Piso 
1069-Buenos Aires 
Tel: 54-1-34-7726 
FAX: 54-1-34-1871 

AUSTRALIA 

Email Electronics 
15-17 Hume Street 
Huntingdale, 3166 
Tel: 011-61-3-544-8244 
TLX: AA 30895 
FAX: 011-61-3-543-8179 

NSD-Australia 
205 Middleborough Rd. 
Box Hill, Victoria 3128 
Tel: 03 8900970 
FAX: 03 8990819 

BRAZIL 

Microlinear 

Largo do Arouche, 24 
01219 Sao Paulo, SP 
Tel: 5511-220-2215 
FAX: 5511-220-5750 

CHILE 

Sisteco 

Vecinal 40 - Las Condes

Santiago 

Tel: 562-234-1644 

FAX: 562-233-9895 

CHINA/HONG KONG 

Novel Precision Machinery Co., Ltd. 

Room 728 Trade Square 

681 Cheung Sha Wan Road 

Kowloon, Hong Kong 

Tel: (852) 360-8999 

TWX: 32032 NVTNL HX 

FAX: (852) 725-3695 

GUATEMALA 

Abinitio 

11 Calle 2-Zona 9
Guatemala City 
Tel: 5022-32-4104 
FAX: 5022-32-4123 



INDIA 

Micronic Devices 
Arun Complex 
No. 65 D.V.G. Road 
Basavanagudi 
Bangalore 560 004 
Tel: 011-91-812-600-631 
011-91-812-611-365 
TLX: 9538458332 MDBG 

Micronic Devices 

No. 516 5th Floor 

Swastik Chambers 

Sion, Trombay Road 

Chembur 

Bombay 400 071 

TLX: 9531 171447 MDEV 

Micronic Devices 
25/8, 1st Floor 
Bada Bazaar Marg 
Old Rajinder Nagar 
New Delhi 110 060 
Tel: 011-91-11-5723509 

011-91-11-589771 
TLX: 031-63253 MDND IN 

Micronic Devices 

6-3-348/1 2A Dwarakapuri Colony 

Hyderabad 500 482 

Tel: 011-91-842-226748 

S&S Corporation 
1587 Kooser Road 
San Jose, CA 95118 
Tel: (408) 978-6216 
TLX: 820281 
FAX: (408) 978-8635 

JAMAICA 

MC Systems 
10-12 Grenada Crescent 
Kingston 5 
Tel: (809) 929-2638 
(809) 926-0188 
FAX: (809) 926-0104

JAPAN 

Asahi Electronics Co. Ltd. 
KMM Bldg. 2-14-1 Asano 
Kokurakita-ku 
Kitakyushu-shi 802 
Tel: 093-511-6471 
FAX: 093-551-7861 



CTC Components Systems Co., Ltd. 
4-8-1 Dobashi, Miyamae-ku 
Kawasaki-shi, Kanagawa 213 
Tel: 044-852-5121 
FAX: 044-877-4268 

Dia Semicon Systems, Inc. 

Flower Hill Shinmachi Higashi-kan 

1-23 Shinmachi, Setagaya-ku 

Tokyo 154 

Tel: 03-3439-1600 

FAX: 03-3439-1601 

Okaya Koki 
2-4-18 Sakae 
Naka-ku, Nagoya-shi 460 
Tel: 052-204-8315 
FAX: 052-204-8380 

Ryoyo Electro Corp. 
Konwa Bldg. 
1-12-22 Tsukiji 
Chuo-ku, Tokyo 104 
Tel: 03-3546-5011 
FAX: 03-3546-5044 

KOREA 

J-Tek Corporation 

Dong Sung Bldg. 9/F 

158-24, Samsung-Dong, Kangnam-Ku 

Seoul 135-090 

Tel: (822) 557-8039 

FAX: (822) 557-8304 

Samsung Electronics 

Samsung Main Bldg. 

150 Taepyung-Ro-2KA, Chung-Ku 

Seoul 100-102 

C.P.O. Box 8780 

Tel: (822) 751-3680 

TWX: KORSST K 27970 

FAX: (822) 753-9065 

MEXICO 

PSI S.A. de C.V. 
Fco. Villa esq. Ajusco s/n 
Cuernavaca, MOR 62130 
Tel: 52-73-13-9412 
52-73-17-5340 
FAX: 52-73-17-5333 

NEW ZEALAND 

Email Electronics 
36 Olive Road 
Penrose, Auckland 
Tel: 011-64-9-591-155 
FAX: 011-64-9-592-681 



SAUDI ARABIA 

AAE Systems, Inc. 

642 N. Pastoria Ave. 

Sunnyvale, CA 94086 

U.S.A. 

Tel: (408) 732-1710 

FAX: (408) 732-3095 

TLX: 494-3405 AAE SYS 

SINGAPORE 

Electronic Resources Pte, Ltd. 
17 Harvey Road 
#03-01 Singapore 1336 
Tel: (65) 283-0888 
TWX: RS 56541 ERS 
FAX: (65) 289-5327 

SOUTH AFRICA 

Electronic Building Elements 

178 Erasmus St. (off Watermeyer St.)

Meyerspark, Pretoria, 0184 

Tel: 011-2712-803-7680 

FAX: 011-2712-803-8294 

TAIWAN 

Micro Electronics Corporation 

12th Floor, Section 3 

285 Nanking East Road 

Taipei, R.O.C. 

Tel: (886) 2-7198419 

FAX: (886) 2-7197916

Acer Sertek Inc. 
15th Floor, Section 2 
Chien Kuo North Rd. 
Taipei 18479 R.O.C. 
Tel: 886-2-501-0055 
TWX: 23756 SERTEK 
FAX: (886) 2-5012521 

URUGUAY 

Interfase 
Zabala 1378 
11000 Montevideo 
Tel: 5982-96-0490 
5982-96-1143 
FAX: 5982-96-2965 

VENEZUELA 

Unixel C.A.

4 Transversal de Monte Cristo 

Edf. AXXA, Piso 1, of. 1 & 2

Centro Empresarial Boleita 

Caracas 

Tel: 582-238-6082 

FAX: 582-238-1816 



*Field Application Location 
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NORTH AMERICAN SERVICE OFFICES 



ALASKA 

Intel Corp. 

c/o TransAlaska Network 
1515 Lore Rd. 
Anchorage 99507 
Tel: (907) 522-1776 

Intel Corp. 

c/o TransAlaska Data Systems 

c/o GCI Operations 

520 Fifth Ave., Suite 407 

Fairbanks 99701 

Tel: (907) 452-6264 

ARIZONA

**Intel Corp.
410 North 44th Street
Suite 500
Phoenix 85008
Tel: (602) 231-0386
FAX: (602) 244-0446

**Intel Corp.

500 E. Fry Blvd., Suite M-15

Sierra Vista 85635

Tel: (602) 459-5010

ARKANSAS 

Intel Corp. 
c/o Federal Express 
1500 West Park Drive
Little Rock 72204 

CALIFORNIA 

*Intel Corp.

21515 Vanowen St., Ste. 116 

Canoga Park 91303 

Tel: (818) 704-8500 

*Intel Corp.

300 N. Continental Blvd. 

Suite 100 

El Segundo 90245 

Tel: (213) 640-6040 

*Intel Corp.
1900 Prairie City Rd. 
Folsom 95630-9597 
Tel: (916) 351-6143 

*Intel Corp.

9665 Chesapeake Dr., Suite 325 

San Diego 92123

Tel: (619) 292-8086 

**Intel Corp.

400 N. Tustin Avenue 

Suite 450 

Santa Ana 92705 

Tel: (714) 835-9642 

**Intel Corp.

2700 San Tomas Exp., 1st Floor 

Santa Clara 95051 

Tel: (408) 970-1747 

COLORADO 

*Intel Corp.

600 S. Cherry St., Suite 700

Denver 80222

Tel: (303) 321-8086



CONNECTICUT 

*Intel Corp.

301 Lee Farm Corporate Park 

83 Wooster Heights Rd. 

Danbury 06811 

Tel: (203) 748-3130 

FLORIDA 

**Intel Corp.

800 Fairway Dr., Suite 160 
Deerfield Beach 33441 
Tel: (305) 421-0506 
FAX: (305) 421-2444 

*Intel Corp.

5850 T.G. Lee Blvd., Ste. 340 

Orlando 32822 

Tel: (407) 240-8000 

GEORGIA 

*Intel Corp.

20 Technology Park, Suite 150 

Norcross 30092 

Tel: (404) 449-0541 

5523 Theresa Street 
Columbus 31907 



HAWAII

**Intel Corp.
Honolulu 96820 
Tel: (808) 847-6738 

ILLINOIS 

**†Intel Corp.

Woodfield Corp. Center III 

300 N. Martingale Rd., Ste. 400 

Schaumburg 60173 

Tel: (708) 605-8031 

INDIANA 

*Intel Corp.

8910 Purdue Rd., Ste. 350 
Indianapolis 46268 
Tel: (317) 875-0623 

KANSAS 

*Intel Corp.

10985 Cody, Suite 140 
Overland Park 66210 
Tel: (913) 345-2727 

KENTUCKY 

Intel Corp. 

133 Walton Ave., Office 1A

Lexington 40508 

Tel: (606) 255-2957 

Intel Corp. 

896 Hillcrest Road, Apt. A 

Radcliff 40160 (Louisville) 



LOUISIANA

Hammond 70401

(serviced from Jackson, MS) 



MARYLAND 

**Intel Corp.

10010 Junction Dr., Suite 200 
Annapolis Junction 20701 
Tel: (301) 206-2860 

MASSACHUSETTS 

**Intel Corp.
Westford Corp. Center 
3 Carlisle Rd., 2nd Floor 
Westford 01886 
Tel: (508) 692-0960 

MICHIGAN 

*Intel Corp.

7071 Orchard Lake Rd., Ste. 100

West Bloomfield 48322

Tel: (313) 851-8905

MINNESOTA 

*Intel Corp.

3500 W. 80th St., Suite 360 

Bloomington 55431 

Tel: (612) 835-6722 

MISSISSIPPI 

Intel Corp. 

c/o Compu-Care 

2001 Airport Road, Suite 205F 

Jackson 39208 

Tel: (601) 932-6275 

MISSOURI 

*Intel Corp.

3300 Rider Trail South 

Suite 170 

Earth City 63045 

Tel: (314) 291-1990 

Intel Corp. 
Route 2, Box 221 
Smithville 64089 
Tel: (913) 345-2727 

NEW JERSEY 

**Intel Corp.
300 Sylvan Avenue 
Englewood Cliffs 07632 
Tel: (201) 567-0821 

*Intel Corp.
Lincroft Office Center 
125 Half Mile Road 
Red Bank 07701 
Tel: (908) 747-2233 

NEW MEXICO 

Intel Corp. 

Rio Rancho 1 

4100 Sara Road 

Rio Rancho 87124-1025 

(near Albuquerque) 

Tel: (505) 893-7000 



NEW YORK 

*Intel Corp.

2950 Expressway Dr. South 

Suite 130 

Islandia 11722 

Tel: (516) 231-3300

Intel Corp. 

300 Westage Business Center 

Suite 230 

Fishkill 12524 

Tel: (914) 897-3860 

Intel Corp. 

5858 East Molloy Road 
Syracuse 13211 
Tel: (315) 454-0576 

NORTH CAROLINA 

*Intel Corp.

5800 Executive Center Drive 

Suite 105 

Charlotte 28212 

Tel: (704) 568-8966 

**Intel Corp.

5540 Centerview Dr., Suite 215 

Raleigh 27606 

Tel: (919) 851-9537 

OHIO 

**Intel Corp.

3401 Park Center Dr., Ste. 220 

Dayton 45414 

Tel: (513) 890-5350 

*Intel Corp.

25700 Science Park Dr., Ste. 100

Beachwood 44122

Tel: (216) 464-2736

OREGON 

**Intel Corp.

15254 N.W. Greenbrier Pkwy. 

Building B 

Beaverton 97006 

Tel: (503) 645-8051 

PENNSYLVANIA 

*†Intel Corp.
925 Harvest Drive 
Suite 200 
Blue Bell 19422 
Tel: (215) 641-1000 
1-800-468-3548 
FAX: (215) 641-0785

**†Intel Corp.

400 Penn Center Blvd., Ste. 610 

Pittsburgh 15235 

Tel: (412) 823-4970 

*Intel Corp.
1513 Cedar Cliff Dr. 
Camp Hill 17011 
Tel: (717) 761-0860 



PUERTO RICO 

Intel Corp. 

South Industrial Park 
P.O. Box 910 
Las Piedras 00671 
Tel: (809) 733-8616 

TEXAS 

**Intel Corp.

Westech 360, Suite 4230 

8911 N. Capitol of Texas Hwy.

Austin 78752-1239

Tel: (512) 794-8086

**†Intel Corp.

12000 Ford Rd., Suite 401

Dallas 75234

Tel: (214) 241-8087

**Intel Corp.

7322 SW Freeway, Suite 1490 

Houston 77074 

Tel: (713) 988-8086 

UTAH 

Intel Corp. 

428 East 6400 South

Suite 104 

Murray 84107 

Tel: (801) 263-8051 

FAX: (801) 268-1457

VIRGINIA 

*Intel Corp.

9030 Stony Point Pkwy. 
Suite 360 
Richmond 23235 
Tel: (804) 330-9393 

WASHINGTON 

**Intel Corp.

155 108th Avenue N.E., Ste. 386 

Bellevue 98004 

Tel: (206) 453-8086 

CANADA 

ONTARIO 

**Intel Semiconductor of

Canada, Ltd. 

2650 Queensview Dr., Ste. 250 

Ottawa K2B 8H6 

Tel: (613) 829-9714 

**Intel Semiconductor of
Canada, Ltd.
190 Attwell Dr., Ste. 102
Rexdale (Toronto) M9W 6H8 
Tel: (416) 675-2105 

QUEBEC 

**Intel Semiconductor of

Canada, Ltd. 

1 Rue Holiday 

Suite 115 

Tour East 

Pt. Claire H9R 5N3 

Tel: (514) 694-9130

FAX: 514-694-0064 



ARIZONA 

2402 W. Beardsley Road 
Phoenix 85027 
Tel: (602) 869-4288 
1-800-468-3548



CUSTOMER TRAINING CENTERS 



SYSTEMS ENGINEERING OFFICES 



MINNESOTA 

3500 W. 80th Street 
Suite 360 
Bloomington 55431 
Tel: (612) 835-6722 



NEW YORK

2950 Expressway Dr., South
Islandia 11722 
Tel: (516) 231-3300



*Carry-in locations 
**Carry-in/mail-in locations 



Multimedia and 
Supercomputing Processors 

Intel Corporation's Multimedia and
Supercomputing Components Group (MSCG) products
enrich computerized information and exchange
technologies in imaginative ways never
before possible. To learn more about Intel's
problem-solving MSCG products, the i750®
video processor and the i860™ and i960™
microprocessor families, read
this publication.
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